PREFACE


The effort to understand human behavior must itself be one of the
oldest of human behaviors. But for all the centuries of effort, there is
no compelling evidence to convince us that we do understand human
behavior very well. Instead, there are the unsolved behavioral problems
of mental illness, racism, and violence, of both the idiosyncratic and
institutionalized varieties, to bear witness to how much there is we do
not yet know about human behavior. In the face of the urgency of the
questions waiting to be answered it should not be surprising that
behavioral scientists, and the publics that support them, should suffer
from a certain impatience. That impatience is understandable, but
perhaps from time to time we need to remind ourselves that we have not
really been in business for very long.

The application of that reasoning and of those procedures which
together we call the scientific method to the understanding of human
behavior is of relatively very recent origin. What we have learned about
human behavior in the short period, say from the founding of Wundt’s
laboratory in Leipzig in 1879 until now, is out of all proportion to
what we learned in preceding centuries. The success of the application
of scientific method to the study of human behavior has given us new
hope for an accelerating return of knowledge on our investment of time
and effort. But most of what we want to know is still unknown. The
application of what we think of as scientific method has not simplified
human behavior. It has perhaps shown us more clearly just how complex
it really is.

In contemporary behavioral research it is the research subject we try
to understand. He serves as our model of man in general, or at least
of a certain kind of man. We know that his behavior is complex. We
know it because he does not behave exactly as does any other subject.
We know it because sometimes we change his world ever so slightly
and observe his behavior to change enormously. We know it because
sometimes we change his world greatly and observe his behavior to
change not at all. We know it because the “same” careful experiment
conducted in one place at one time often yields results very different
from one conducted in another place at another time. We know his
complexity because he is so often able to surprise us with his behavior.

Much of the complexity of human behavior may be in the nature of
the organism. But some of this complexity may derive from the social
nature of behavioral research itself. Some of the complexity of man as
we know it from his model, the research subject, may reside in the
fact that the subject usually knows perfectly well that he is to be a
research subject and that this role is to be played out in interaction
with another human being, the investigator.

That portion of the complexity of human behavior which can be
attributed to the social nature of behavioral research can be
conceptualized as a set of artifacts to be isolated, measured, considered,
and, sometimes, eliminated. This book is designed to consider in detail
a number of these artifacts. The purpose is not simply to examine the
methodological implications, though that is an important aspect, but
also to examine some of the substantive implications. It may be that
all “artifacts,” when closely examined, teach us something new about
a topic of substantive interest.

The introductory chapter, which was written by our late colleague
Edwin G. Boring, provides a perspective on artifact and a discussion of
the nature of experimental control. The following six chapters are a
series of position papers by researchers who have been actively engaged
in systematic exploration of various antecedents of artifact in behavioral
research, and each writer summarizes the findings in his respective
area. Those six essays, in the order of their presentation, are by
William J. McGuire on suspiciousness of intent, Robert Rosenthal and
Ralph L. Rosnow on volunteer effects, Robert E. Lana on pretest
sensitization, Martin T. Orne on demand characteristics, Rosenthal on
experimenter expectancy effects, and Milton J. Rosenberg on evaluation
apprehension. The final chapter, by Donald T. Campbell, takes into
account the separate contributions and tells us something of the future
prospects for behavioral research.

In organizing this volume, the editors have been guided by Herbert
Hyman’s comment that the demonstration of systematic error may well
mark an advanced state of a science.


All scientific inquiry is subject to error, and it is far better to
be aware of this, to study the sources in an attempt to reduce
it, and to estimate the magnitude of such errors in our findings,
than to be ignorant of the errors concealed in the data. One
must not equate ignorance of error with the lack of error. The
lack of demonstration of error in certain fields of inquiry often
derives from the nonexistence of methodological research into
the problem and merely denotes a less advanced stage of that
profession.*

The editors thank Academic Press for their patience and continued
interest throughout the two and one-half year evolution of this book. To
our contributors (Boring, Campbell, Lana, McGuire, Orne, and
Rosenberg) we are indebted for their thoughtful and thought-provoking
essays. Our task was greatly facilitated by separate grants to each of us
from the Division of Social Sciences of the National Science Foundation.


Edwin G. Boring, who passed away on July 1, 1968, wrote once of 
the sense of inadequacy of the individual scholar to available information 
at any moment of his existence and of the feeling sometimes of being 
overwhelmed by the complexity of nature. 

That would explain Kepler’s looking for a geometrical
generalization to explain the planets in the solar system, would give
a sound basis for the need for all generalization in science. And
nowadays we no longer hope to learn about everything that's
in nature, but only about everything that's already been
published about nature, and ultimately we sink, gasping . . .

Still, I am content to live in this age. Titchener once said
that he would have liked to live in the age when one man
could know everything, and that was quite long ago. We must
accustom ourselves to an age in which one man never knows
more than just enough to use for a given purpose.†

How fortunate our age was to have Edwin G. Boring. We take pride 
in dedicating this book to his memory. 

January 29, 1969 Robert Rosenthal 

Ralph L. Rosnow 

* Hyman, Herbert H. Interviewing in social research. Chicago: University of
Chicago Press, 1954. P. 4.

† Personal communication, November 1, 1967.
Chapter 1


PERSPECTIVE: 
Artifact and Control 


Edwin G. Boring 
Harvard University 


I. THE CONCEPT OF CONTROL 

If x, then y. That is the formula for John Stuart Mill’s method of
agreement. The independent variable is x and [. . .] that x is a
sufficient condition of y, and the [. . .] of this relation has sometimes
been thought to be the aim [. . .]

The statement is, however, not enough. It must [. . .] if not-x, then
not-y, and the two formulas together [. . .] method of agreement and
difference, establishing the independent variable, x, as both the
sufficient and the necessary condition of y. It is essential to add the
method of difference to the method of agreement in order to establish x
as necessary to y as well as sufficient (Mill, 1843, Bk. III, chap. 8).
In short, not-x is the control. [. . .] condition only if it be shown
that y does not [. . .] In this sense Mill was a good expositor for the
concept of control [. . .] he did not use the term. (On the history of
control, see Boring, 1954, [. . .])

[. . .] had not been overlooked before [. . .] 1648 planned the [. . .]
of the Puy-de-Dôme [. . .] atmospheric pressure at the [. . .] he
provided also for a second barometer which was kept [. . .] of the
mountain and was found to remain unchanged (Pascal, 1937, 97-112;
Cohen, 1948, 71f.; Conant, 1931, 39; Boring, 1954, 577f.; 1963,
115). It was really a control. The independent variable was the height,
the dependent variable was the atmospheric pressure, and the procedure
was the joint method of agreement and difference, two centuries before
Mill had named it and laid down the rules.
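
The two formulas of the joint method can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the function name `joint_method` and the observation records are hypothetical and come from no source cited here. It checks agreement (wherever x is present, y is present) and difference (wherever x is absent, y is absent), and it reports whether any not-x cases, the controls, were observed at all.

```python
from typing import Iterable, Tuple


def joint_method(observations: Iterable[Tuple[bool, bool]]) -> dict:
    """Check Mill's two formulas on (x_present, y_present) records.

    agreement:  if x, then y          -> x looks sufficient for y
    difference: if not-x, then not-y  -> x looks necessary for y
    """
    obs = list(observations)
    with_x = [y for x, y in obs if x]
    without_x = [y for x, y in obs if not x]  # the control cases
    # Each claim requires at least one relevant observation.
    sufficient = bool(with_x) and all(with_x)
    necessary = bool(without_x) and not any(without_x)
    return {
        "sufficient": sufficient,
        "necessary": necessary,
        "has_control": bool(without_x),
    }


# Hypothetical records loosely modeled on the two-barometer observation:
# x = "barometer carried up the mountain", y = "mercury column changed".
records = [(True, True), (True, True), (False, False)]
result = joint_method(records)
print(result)
```

Without any not-x record the function cannot report necessity, which is the sense in which agreement alone, lacking a control, falls short.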
Actually it is the use of the method of difference, that is to say,
of control, that puts rigor into science. A fact is a difference. Something
is this and not that. Any observed value has meaning only in relation
to some frame of reference, and any quantity only in respect of the
scale in which it is set. Lana (Chapter 4) makes this point in his paper
in the present volume. The method of concomitant variations, which
Mill gave as a separate method, is really an elaboration of his method
of difference, for every pair of values of x and y is placed in relation
to every other x-y pair from which the first pair differs. The paradigm
for concomitant variations is y = f(x), and its determination is the
scientific ideal. The joint methods of agreement and difference, if x, then
y and if not-x, then not-y, are really only one pair of cases in the
method of concomitant variations, where x is some positive value and
[. . .] science is the determination of [. . .] y = f(x) by the
observation of [. . .] the method of agreement [. . .] of an historical
[. . .] inductively a [. . .] variable. To establish [. . .] of an
observable relationship, one has [. . .] (a synonym for the inscrutable
[. . .] ignorance of prognostic causality). One cannot control historical
[. . .] comparable examples to be found in other descriptive sciences
like astronomy, geology, and some branches of biology.

A. Four Meanings

The word control originally meant counter roll, a [. . .] master list
against which any subsequent special [. . .] if necessary corrected. Thus
the term [. . .] in order to induce or maintain conformity. [. . .]

(1) Control has long been used in the sense of maintaining constancy
of conditions and also for checking an experimental variable to see if
it is adhering to its stated or intended specifications. The artifacts with
which this volume is primarily concerned are mostly of this kind. The
independent variable of y = f(x) is contaminated, often unwittingly,
by additional unspecified determinants that affect y. This is the oldest
scientific meaning of the word control, one which the discussions of
the present volume re-emphasize.

(2) In the late nineteenth century the use of the control experiment
or control test came into psychology and to a lesser extent into biology,
although not always with that name. For instance, the Hipp chronoscope
was calibrated by a “control hammer,” a heavy pivoted hammer which
was released electrically, and in falling tripped successively two
switches, wired in with the chronoscope so that the time of the hammer’s
fall from one switch to the other was measured. Since less variability
could be expected of the fall-hammer than of the chronoscope, the
hammer was used to calibrate the chronoscope. A number of successive
falls constituted a “control series” from which the errors of the
chronoscope could be computed (Wundt, 1874, 772; 1911, III, 367).

Control tests, called “puzzle experiments” (Vexirversuche), were used
in the early measurements of the cutaneous two-point threshold. The
separation of the compass points placed upon the skin was varied and
the observer reported whether he felt one or two. [. . .] that Titchener
called the stimulus-error tends to make [. . .] subject knows that two
points are always being placed on his skin, it becomes difficult for him
to report a unitary perceptual pattern because he knows that two points
are being applied (Boring, [. . .] 485-470, 1963, 267-271). Especially is
this difficulty present in [. . .] subjects, like McDougall’s primitive
people in the Torres Straits [. . .] wanted to show off their fineness of
[. . .] 141-223, esp. 189-193). To control this error, single points
[. . .] in with the double, and that control works well to [. . .] the
skill with which the subject can discriminate single from [. . .] It is
not, however, wholly successful in obtaining a [. . .] perceptual pattern,
for the reason that very often a [. . .] give a very good dual [. . .]
this independent [. . .] been left free to vary and does vary. It has been
suggested that this variation could be caused by the presence or absence
of multiple innervation at the point stimulated (Kincaid, 1918; Boring,
1954, 579f.; 1963, 116f.; also on multiple innervation, Boring, 1916,
8^93).

At the end of the century Müller and Pilzecker (1900) published their
elaborate investigation on the use of the method of right associates
in the study of memory. They used principal series (method of
agreement) and comparison series (method of difference), what we should
nowadays call the experimental and control series.

The use of the control test, experiment, or series became almost
standard in the twentieth century, and the growth of behavioral
psychology, in which discrimination plays the fundamental role in
assessing the psychological capacities of the subject, has practically put
the word control out of common usage, for a discrimination is the
observation of a difference and Mill’s joint method is now standard.
Lana’s discussion (Chapter 4) of the use of the pretest in social research
shows both the importance of control and the manner in which it often
introduces an artifact by changing the experimental status of the subject
before the crucial [. . .]


(3) The use of the control group avoids the difficulty of the pretest
artifact by introducing another difficulty. The control group and the
experimental group are independent, since they are constituted of
different [. . .] different, they are not identical [. . .] identical they
are. You [. . .] individual, litter mates of the same [. . .] subjects
[. . .] there [. . .] inscrutable difference [. . .] remain in the dilemma
of complementarity [. . .] independence at the expense of assured
equivalence.

One early example (Hankin, 1890) of the use of [. . .] control group
is an experiment demonstrating the effective immunization of mice
against tetanus. [. . .] when inoculated and the controls died. The
experiment [. . .] because the difference between a live mouse and a dead
[. . .] acceptable level of confidence that a dead mouse [. . .] without
the calculation of a [. . .]

[. . .] also provided some early practice, learning which could not readily
be separated from the formal practice of the experimental group, the
effect of which was what was being measured. It is desirable to give
a pretest in skill A to both an experimental and a control group, and
then to give formal practice in skill B to the experimental group while
the control group is being left unpracticed. After that, both groups can
be tested and compared for improvement in A to see if practice of
the experimental group in B led it to more improvement in A than
had been furnished by the pretest for the control group. Thorndike
and Woodworth (1901, esp. 558) are the first to have introduced this
conception into the study of transfer, [. . .] their use of it was [. . .]
The earliest carefully designed study was by Winch (1908), [. . .]
which just happens to have been made in the same year that “Student”
(1908) published his paper on how to determine the [. . .]
differences between groups by converting the critical ratio into a t-value.
The two developments ran along neck and neck: the [. . .]
groups and the statistical techniques for quantifying the confidence you
could feel for the significance of the differences between such groups
(the R. A. Fisher confidences). This kind of artifact, due to the [. . .]
of showing that the differences between the [. . .]
is not an artifact that this book considers, so we may leave the topic
there. Since some of the progress of science occurs by the discovery
and correction of the many kinds of artifacts, it would seem that
science still has a considerable future.
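
The pretest and control-group design for transfer described above can be sketched numerically. What follows is a minimal illustration, assuming made-up scores and a hypothetical helper named `transfer_effect`; it is not a reconstruction of any analysis by Thorndike and Woodworth or by Winch. Both groups are pretested and retested in skill A, only the experimental group practices skill B in between, and the quantity of interest is how much more the experimental group improves in A than the control group does.

```python
from statistics import mean


def transfer_effect(exp_pre, exp_post, ctl_pre, ctl_post):
    """Return (experimental gain, control gain, difference) in skill A.

    The control gain estimates the improvement produced by the pretest
    itself; subtracting it leaves the gain attributable to practice in B.
    """
    exp_gain = mean(post - pre for pre, post in zip(exp_pre, exp_post))
    ctl_gain = mean(post - pre for pre, post in zip(ctl_pre, ctl_post))
    return exp_gain, ctl_gain, exp_gain - ctl_gain


# Hypothetical skill-A scores (illustrative only, not data from the text).
exp_pre, exp_post = [10, 12, 11], [16, 17, 18]   # practiced skill B
ctl_pre, ctl_post = [10, 11, 12], [12, 13, 14]   # left unpracticed

exp_gain, ctl_gain, effect = transfer_effect(exp_pre, exp_post, ctl_pre, ctl_post)
print(exp_gain, ctl_gain, effect)
```

The sketch stops at the raw difference in gains; deciding whether such a difference between two independent groups is significant is the separate statistical question that the paragraph assigns to the t-value.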

(4) The fourth scientific use of the concept of control is irrelevant
here. It is the reversion to the early [. . .] Skinner’s use of the notion
of the control and shaping of behavior (Skinner, 1953). It is applied to
what may be [. . .] artifacts and has such social uses as [. . .]
deviations or the eradication of [. . .] is, of course, an artifact as man
promotes it for his fellows. That is what Rousseau thought when he
extolled the noble savage.


II. THE PROBLEM OF ARTIFACT

Now let us turn more specifically [. . .] to the problem of the
artifact. Most of [. . .] is concerned with the constancy and
specification of the experimental conditions. These requirements raise
questions of [. . .] type 1, the discovery and specification of extraneous
variables, for the most part social in nature, that affect the
experimental variables. There is always the hope that, once understood,
they can be eliminated or at least be made subject to correction. The
experimenter’s expectations and personality (Rosenthal, Chapter 6),
subjects’ personality (Rosenthal and Rosnow, Chapter 3), their awareness
of the experimenter’s intent (McGuire, Chapter 2), or their concern that
they are being evaluated (Rosenberg, Chapter 7) may affect the results.
The degree to which such factors inhere in the conditions of the
experiment needs to be known, so that the ways to avoid their influences
can come under consideration. (For an earlier discussion of these
matters, see Rosenthal, 1965.)

Certainly one of the most scientifically important sources of error
in experimentation lies in the indeterminacy of the specification of the
variables. Consider the independent variable. If x, then y, and if not-x,
then not-y. But what is x? Mill, the logician, did not have to consider
that. To a logician, x is x, but to a scientist x is a variable, identified
in words which may easily mean one thing at one time and another
at another, or different things in different laboratories or at different
periods of history. We have already seen that a single point touching
the skin is not always the same stimulus. It may be felt as one, yet
sometimes as two (a Vexirfehler).

Take the specific energy of nerves, which was well validated by
[. . .] discredited about seventy-five [. . .] have come about?
[. . .] stimulation of the nerve of any one of the five [. . .]
time. Sense physiologists had been talking [. . .] that the nerves
conduct animal spirits [. . .] Müller was making the point that the
[. . .] perceive directly the external world but only what the [. . .]
that each of the five nerves brings [. . .] were not clearly
distinguished [. . .] each [. . .] of nerve brings something different to
the [. . .] specifically different [. . .] he did not quite forget it, but
he did not [. . .] was that the nerves have another specificity [. . .]
what they conduct. They have specificity of projection [. . .] the
specificity [. . .] make in the brain. The crucial difference lies not in
[. . .] but in [. . .] conduction leads [. . .] more than half a
century to right this error.
The Wever-Bray effect, when it was first discovered, furnished another
example of the mistaken identity of a variable, this time not the x or
y of the observation but a physiologically intervening variable. When
electronic amplification came in before 1930, it became possible for these
investigators to put electrodes on the auditory nerve of a cat, amplify
the voltage, speak to the cat, and hear the sounds in a loud-speaker
over the circuit of amplification. This was a dramatic finding, though
Wever and Bray reported it with great care and circumspection (Wever
and Bray, 1930). Presently it was discovered that the amplified potentials
were not from the impulses of the VIIIth nerve but were induced by
the electrical events involved in the action of the organ of Corti in the
cochlea (Davis and Saul, 1931; Davis, Derbyshire, Lurie, and Saul, 1934;
Boring, 1942, 420-423, 434-436). Both discoveries were important as
bearing upon the electrical nature of receptor response, but the first
belief that the finding supported the frequency theory of hearing was
due to an incorrect specification of the variables.

As a matter of fact, the experimental variable is ever so much more
complex than is ordinarily supposed, and often the discovery of [. . .]
is a scientific event of considerable importance. For instance, [. . .]
seldom sure about the true nature of a stimulus until [. . .]
accomplished the analysis. Such was Newton’s discovery of the [. . .]
color and especially of the surprisingly complex [. . .] Galileo’s
discovery of the stimulus for pitch not only put the psychology of tone
in readiness for development but also made possible [. . .] management
of music. The psychology of smell has been [. . .] because the nature of
its stimulus remained unknown (Boring, [. . .] 448). John Dewey in his
famous paper on the reflex arc made the point that the stimulus cannot
be presumed but has to be discovered (Dewey, 1896, 370; Boring, 1950,
554). The true stimulus is often [. . .] known only as the result of
careful research. For instance, [. . .] for apparent visual size under
the rule of size constancy [. . .] law is the linear size of the retinal
image of the [. . .] by the distance of the object from the [. . .]
invariant (Boring, 1942, 292; 1952, 144-146). A great deal of [. . .]
of science has depended upon the discovery of such invariants (Stevens,
[. . .] but its specification as an invariant is [. . .]

[. . .] furnishes us good examples of [. . .] independent variable
[. . .] When research on hypnosis is being undertaken, [. . .] the
independent variable is the experimenter’s [. . .] subject’s behavior,
but it is a mistake to [. . .]
words are all of the suggestion. “Bring me that rattlesnake,” says the
experimenter of a live coiled rattlesnake behind invisible glass, and the
subject complies until prevented by the glass (Rowland, 1939). Would
he have done so had there been no glass? Perhaps, but the “demand”
made upon him, to use Martin Orne’s term, was more than the verbal
instruction. It included the knowledge that this was an experiment, that
you do not get truly injured in an experiment, that there is an
experimenter and a university looking out for you. The demand
characteristic is much broader than it is explicit. Martin Orne (Chapter 5)
shows how the essays in this book deal with instructional demands upon the
subject’s behavior, instructions that are enormously amplified by special
cues of which in many cases the subject is unaware. The independent
variable is insufficiently specified.


III. A DILEMMA

Now, how is the correctness of the specification of the experimental
variables to be protected from all these predisposing additions, conscious
and unconscious, that the subject adds, often from his knowledge about
the experimental situation, to the intended explication of the
independent variable? The answer would seem to lie in keeping the subject
ignorant of what is going on, but that is difficult. In this respect a
group control is better than a control experiment, because the one group
has no communication with the other, but this advantage is offset by
the fact that one cannot be sure that the two groups are comparable.
Are animals better subjects because they do not know the difference
between an experiment and real life? Not always. There was H. M.
Johnson’s dog, whose threshold for pitch discrimination turned out to
be the same as Johnson’s because the dog watched Johnson’s face and
wanted to please (Johnson, 1913, 27-31). The skilled horse, Clever Hans,
also watched his master and sought to please (Pfungst, 1911, 1965).

It would be better to secure ignorance by not working in a laboratory
nor letting the subjects know that they are subjects or that there is
an experimenter. That limitation may, however, put you in the position
of the historian or the astronomer, limited to the method of agreement
without a control. You cannot tell the subject to do this or that without
giving away the artificiality of the situation, and, if the intended part
of the observation is known to be artificial, other artifacts are almost
sure to squeeze in. The investigators whose observation resulted in the
publication of When Prophecy Fails infiltrated a fanatical group in order
to note what happens when an assured conviction is frustrated
(Festinger, Riecken, and Schachter, 1956), but such work is only gross
preliminary taxonomy. One would like to get more rigorous facts by the use
of experimentation.

The choice between laboratory control and the free uncontrolled
behavior of natural phenomena is no new dilemma. Let us for a moment
go back seventy-five or even only fifty years to the time when
introspection was the principal method of the new experimental psychology.
In those days secrecy was the rule about the experiments. Students did
not talk about procedures and observations with one another. There
was no general discussion of work in progress except perhaps at the
intimate meetings of the little Society of Experimental Psychologists,
when graduate students who were subjects in an experiment were
excluded from the room when the experiment was being discussed.
Procedure without knowledge was the rule, and there was in force as strong
an ethic about discussion of current experiments as there was about
classified war material. Did secrecy work? It must have been helpful.
Nowadays, some subjects are undergraduates hired from [. . .]
laboratory, yet gossip spreads in any student group. Perhaps the chief
evidence that hypotheses influenced results in the old days is the
fact that introspection never settled the question of the nature of
feeling: as to whether feeling is an independent quality or a [. . .]
and, if so, what kind; as to whether feeling can or cannot become
the object of attention and how it is observed if it [. . .]
clear focus of attention. There was always, in this crucial introspective
matter, a suspicion that laboratory atmosphere, local hypotheses,
influenced the findings. No one could produce proof, but [. . .]
one of the reasons why introspection faded out for [. . .] of confidence;
at least systematic experimental introspection lapsed, although [. . .]
of psychophysical judgments [. . .]

All in all we are left [. . .] The experimental method is
science’s principal tool [. . .] y = f(x) [. . .] Control is necessary and
is used even when it is not recognized as such. Every fact is at bottom
a difference, and the method of concomitant variations emphasizes this
relational characteristic about facts. Nevertheless, in the specification
of a variable one always remains uncertain as to how exhaustive the
description is. Artifacts adhere implicitly to [. . .] When they
are discovered, ingenuity may still be [. . .] circumvent them. When
they are not discovered, they may persist [. . .] century and
eventually turn out to be the reason why a well-established fact is at
long last disconfirmed. For this [. . .] truth remains forever
tentative, subject always to this possible disconfirmation. But
that is no new idea, is it?

SUSPICIOUSNESS OF EXPERIMENTER’S INTENT


William J. McGuire 

University of California, San Diego 


I. INTRODUCTION

It is a wise experimenter who knows his artifact from his [. . .]
and wiser still is the researcher who realizes that today’s artifact may
be tomorrow’s independent variable. [. . .] man’s artifact may be
another man’s main effect. The essentially relativistic and ambiguous
criterion for calling a variable [. . .] is well illustrated by the topic
on which we focus in this chapter, namely, suspiciousness of the
experimenter’s manipulatory [. . .] However, we shall begin by using the
case of response sets [. . .] transitory nature of the “artifact” status.
Response sets [. . .] stages by which a variable passes from [. . .] focus
since, as an older topic of study, they serve to illustrate the total
career of an artifact more fully than does the more current
“suspiciousness” problem. After discussing the career of [. . .] devote
a section to considering the current artifactual [. . .] of suspiciousness
of experimenter’s intent. This section will [. . .] antecedent variables
that give rise to such suspiciousness and which therefore might
be contaminated by it. A fourth section will review the theoretical
housings in which current research on suspiciousness of experimenter’s
intent research is embedded. [. . .] consider the ethical problem of
deception which is [. . .] in any discussion of suspiciousness of
experimenter’s [. . .]
Any full consideration of how suspiciousness of the experimenter’s
intent operates in psychological research would quickly broaden to in-
clude rubrics such as guinea pig reactions, placebos, faking good, aware-
ness, etc., each of which carries in its train a long history of experimental
investigation. In the mainstream of American experimental psychology,
starting at its source back in the nineteenth century, it has been taken
for granted that the subject should be kept unaware of the purpose
of the experiment even when it dealt with such unemotional issues
as visual acuity or the serial position effect. It must be admitted that
this routine secretiveness has not been universal, and some experimenters
have even used themselves as observers and subjects in the areas
of psychophysics and rote memorization (Ebbinghaus immediately
comes to mind in this regard). An informal recollection of this undis-
guised research in which the experimenter serves as his own subject
inclines me to believe that the results that were obtained in this flagrant
manner have replicated remarkably well under covert conditions. Never-
theless, hiding from the subject the true purpose of one’s experiment
has become normative in psychological research. Ignorance is achieved
either by noninformation or by misinformation. I suspect that one could
discover such ludicrous cases as in, say, the rote memorizing area, where
an experimenter who was investigating the serial position effect told
his subjects that he was studying the effect of knowledge of results,
[...] testing a hypothesis about knowledge of results [...] studying
the serial position effect. The [...] “Minsk-Pinsk” joke fortunately
relieves us from [...] laborious scholarly investigation that would be
needed to [...]. [...] concern [...] suspiciousness regarding [...]
success to interest the student in the [...] (from whom the experimental
subjects are drawn) [...] psychology research, the [...] compulsion in
such areas as psychophysics and verbal [...] might seem quite unneces-
sary. The problem seems more [...] apathy than overcuriosity. If the
student in the classroom [...] beauty of such a relationship as that
between [...] of presentation and shape of the serial position curve
even when [...] special care is necessary to keep the same [...] from
suspecting what we are looking for when he participates as a subject
in the laboratory
and perhaps reacts so atypically that the results will not be generalizable
to a naive population. Yet I suspect it was the great felt need for using
unsuspecting subjects that has promoted some of the practices in Ameri-
can experimental psychology which seem a little peculiar to the layman.
For example, our predilection for using nonhuman subjects, our avoid-
ance of research on certain humanly gripping problems, our use of highly
artificial laboratory situations, our avoidance of phenomenological ex-
planatory concepts, etc., have all been partially motivated by our as-
sumption that good research can be done only to the extent that the
subject is unaware of the purpose of the investigation.
While our secretiveness might seem excessive in such traditional areas
as rote learning, one is inclined to take the need for deception more
seriously in the case of social and personality research. One might have
had only minor worries about the generalizability of research regarding
serial position curves even if our data came from subjects who suspected
that this was indeed what we were studying. There seem more grounds
for worry that disclosure might cause serious loss of generalizability
in other areas such as operant verbal conditioning and attitude change
research. In this chapter we shall concentrate on this latter line of re-
search, but before focusing on this question of the extent to which
awareness (or suspiciousness) of persuasive intent distorts the results
of an attitude change experiment, we shall in the next section outline
what we believe to be the life history of an artifact in general, illustrating
the sequential stages through which it passes in terms of the “response
bias” artifact.


II. THREE STAGES IN THE LIFE OF AN ARTIFACT

A review of the progress of psychological interest in a wide variety
of artifacts would, we believe, reveal a natural progression of this interest
through the three stages of ignorance, coping, and exploitation. At first,
the researchers seem unaware of the variable producing the artifact
and tend even to deny it when its possibility is pointed out to them.
The second stage begins as its existence and possible importance become
undeniable. In this coping phase, researchers tend to recognize and
even overstress the artifact’s importance. They give a great deal of atten-
tion to devising procedures which will reduce its contaminating influence
and its limiting of the generalizability of experimental results. The third
stage, exploitation, grows out of the considerable cogitation during the
coping stage to understand the artifactual variable so as to eliminate
it from the experimental situation. In their attempt to cope, some re-
searchers almost inevitably become interested in the artifactual variable
in its own right. It then begins to receive research attention, not as
a contaminating factor to be eliminated, but as an interesting indepen-
dent variable in its own right. Hence, the variable which began by
misleading the experimenter and then, as its existence became recog-
nized, proceeded to terrorize and divert him from his main interest,
ends by provoking him to new empirical research and theoretical
elaboration.


[...] suspiciousness of persuasive intent artifact has only begun to
reach [...] stage. Hence, in this section we will illustrate the three
stages by [...] the career of its somewhat older sibling in the
artifact family, response biases. The preoccupation with response sets
and [...] developed about five years earlier than that in the awareness
of [...] issue, and now has reached a stage that allows [...]
the [...] life cycle. This brief consideration of [...] to give [...]
present discussion some [...] illustrate our claim that this three stage
career is a [...] artifacts, not peculiar to the suspiciousness of
experimenter’s intent artifact that mainly concerns us in this chapter.

A. The Ignorance Stage

[...] artifact becomes known, its baleful [...] part of its life span
during which [...] essentially an anticlimax. Its deleterious effect on
the development of knowledge occurs during the long [...] prior to its
achieving the explicit attention of researchers. It [...] it leads the
psychologist to draw false conclusions from [...], to trim his theories
in inappropriate ways, and to design new [...] more likely to confuse
than [...]. [...] gratifying [...] who have elected an intellectual
vocation [...] that once we start worrying about an artifact [...] while
our peace may be at an end [...] the development of knowledge [...]
there seems to be a considerable inertia about admitting the existence
of [...]. We are all ready enough to [...] inevitably [...] upsetting
the ongoing routine of things [...] reluctant to admit that his [...]
tends to become a question of [...]. While our superordinate goal is
the [...] the subordinate goal of [...] science, seems heading
inexorably toward such [...] that an artifact be discovered and
rediscovered
several times before it becomes a sufficiently public scandal so that
some bright young men seize upon it as a device to pry their elders
out of their ruts and find a place in the sun for themselves. Like much
else that disturbs and advances a field of knowledge currently, the dis-
covery of artifacts seems to be the work of the associate professors.
Our illustrative artifact of response bias exhibits the difficulty of the
discovery process. That it took the field so long to become interested
in this artifact seems strange for a number of reasons. In the first place,
it is a sufficiently obvious problem so that it occurs to the layman
and must have suggested itself constantly, at least [...] to
those engaged in testing enterprises. Moreover, so much of psychological
research involves ability or personality testing that [...]
upon such an artifact were constantly available. With the almost
stifling amount of current research on response biases, it is [...] to
believe there was ever a time when the field was not [...]
them. As a matter of fact, though, not even several explicit [...]
of its existence stirred up any great amount of [...]
the demonstration by Lorge (1937) and by Lentz [...] of an acquies-
cence response set in personality tests [...] interest until a decade
passed. It is undoubtedly Cronbach (1941, 1942, 1946, 1950) who deserves
to be called “the father of [...]”. Since he called attention to the role
of response [...] in this area has grown progressively. It is perhaps
[...] psychological reality in the history of science that he dealt [...]
with response biases in abilities tests, where they are [...]. It was in
this manageable area that the possible importance of the artifact
was first admitted and attention paid to [...] alarm, in connection
with response biases in personality tests, where they are somewhat more
difficult to cope [...] he found the field little attuned to the [...].
[...] personality tests area, we are faced with [...] response biases
such as social desirability, while with [...] the discussion is
more confined to acquiescence or position [...] somewhat
more easily handled by mechanical [...] of questions or of the ordering
of responses. [...] 1940’s, however, even those working in the personality
area [...] importance of controlling for “faking good” tendencies. At
[...] working with the MMPI realized this problem ([...] Hathaway,
1946, Gough, 1947). This [...] desirability artifact, which has been
maintained [...] until the present, followed ten years of silence after
[...] earlier reports by Steinmetz (1932) and by Kelly, Miles, and [...]
existence and importance of this response bias artifact in [...]
measurement.
B. The Stage of Coping

Once a field has admitted the existence of an artifact, as occurred
in the case of response biases in the 1940s, researchers in the area
devise methods of coping with it so that it does not make the results
of experimentation ambiguous or less generalizable. Sophistication in
achieving this aim tends to pass through several successive steps during
the coping phase. Three of these, rejection, correction, and prevention,
can be illustrated in the case of response sets.
1. Detection and rejection as a mode of coping. The most primitive
form of coping with an artifact is to detect in which subjects it is opera-
tive beyond a certain (arbitrarily determined) amount, and then to
reject the data from all the subjects above and accept at face value
the data from subjects not above this threshold. Behind this strategy
there lurks the double fallacies that there is a magic sieve which can
skim off the noise and leave the information behind, and that this can
[...] simple dichotomization. Still, it must be admitted that re-
[...] and [...] often [...] compromise [...] methodological tactics.
If we will grant that some subjects [...] to the artifactual process
than others, [...] tactic of rejecting [...] arbitrarily [...] amount
[...] contamination by the artifact, while [...] information is thrown
away with the rejected subjects [...] artifactualness remains in those
subjects [...] contamination beyond the preset amount. [...]
MMPI [...] furnishes good examples of the use of three of these [...]
detection and rejection mode of coping: catch scales, [...] counts, and
discrepancy scores. Early in the development [...] inventory, a number
of [...] in order to catch response sets. The most widely used of
these [...] and L scales (though additional scales to catch [...]
malingering, social desirability, etc., were all developed [...]).
[...] is that anyone who answers too many questions [...] inconsistent,
exceedingly rare, too good to [...] as manifesting too much response
bias to [...] protocol [...].
The response count procedure is closely [...]. This simply involves
counting up the number of responses in a certain category, for example,
the number of [...] responses, to detect noncommit-
ment response sets or the number of “yes” responses to check for acquies-
cence response bias. Here again, when one detects subjects exceeding
a certain arbitrarily set level in the use of the response category, their
protocols are rejected.
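
The response count procedure just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the data layout (a mapping from subject to a list of answers), and the 0.8 cutoff are hypothetical and do not come from any published scoring manual.

```python
from typing import Dict, List, Tuple

Protocols = Dict[str, List[str]]


def split_by_response_count(
    protocols: Protocols,
    category: str = "yes",
    max_share: float = 0.8,  # arbitrary cutoff, assumed for illustration
) -> Tuple[Protocols, Protocols]:
    """Reject every protocol whose share of `category` answers exceeds the cutoff.

    Returns (kept, rejected). Subjects with no answers are kept, since no
    response set can be detected for them.
    """
    kept: Protocols = {}
    rejected: Protocols = {}
    for subject, answers in protocols.items():
        share = answers.count(category) / len(answers) if answers else 0.0
        if share > max_share:
            rejected[subject] = answers
        else:
            kept[subject] = answers
    return kept, rejected


if __name__ == "__main__":
    demo = {
        "s1": ["yes"] * 9 + ["no"],         # heavy yea-sayer
        "s2": ["yes", "no", "yes", "no"],   # balanced responder
    }
    kept, rejected = split_by_response_count(demo)
    print("kept:", sorted(kept), "rejected:", sorted(rejected))
```

The arbitrariness of the cutoff is visible in the signature: moving `max_share` changes which subjects count as contaminated, while every retained protocol is still taken at face value.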

A third tactic employing the detection and rejection mode of coping 
is the use of discrepancy scores. In general, the several discrepancy 
approaches involve partitioning the items that measure a given variable, 
for example, the schizophrenic measure on the MMPI, into two subsets, 
one of which is made up of obvious items and the other of more subtle 
items. Subjects are then rejected as trying to conceal their symptoms
if the discrepancy between the subtle and the obvious subscores exceeds
a certain preset amount. Alternatively, simulated patterns,
based on the responses of subjects who have been asked “to fake good,”
are used to detect and reject subjects whose protocols reflect an unac-
ceptably high need to appear healthy.
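
A minimal sketch of the discrepancy approach follows. It assumes that each item is scored 1 when answered in the keyed (symptomatic) direction and 0 otherwise, that concealment shows up as a subtle subscore exceeding the obvious subscore, and that 0.3 is the preset cutoff. All three are illustrative assumptions, not values taken from the MMPI.

```python
from typing import Sequence


def subscore(items: Sequence[int]) -> float:
    """Proportion of items answered in the keyed direction (items are 0 or 1)."""
    if not items:
        raise ValueError("a subscale needs at least one item")
    return sum(items) / len(items)


def discrepancy(subtle: Sequence[int], obvious: Sequence[int]) -> float:
    """Subtle subscore minus obvious subscore (direction is an assumption)."""
    return subscore(subtle) - subscore(obvious)


def flag_concealment(
    subtle: Sequence[int],
    obvious: Sequence[int],
    cutoff: float = 0.3,  # hypothetical preset amount
) -> bool:
    """True when the protocol would be rejected under the preset cutoff."""
    return discrepancy(subtle, obvious) > cutoff


if __name__ == "__main__":
    # Endorses subtle items but avoids the obvious ones.
    print(flag_concealment(subtle=[1, 1, 1, 0], obvious=[0, 0, 1, 0]))
```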

2. The correction mode of coping. The detection and rejection proce- 
dures which we just considered had an obvious arbitrariness to them 
which made them less than ideal as methods for eliminating the effects 
of the artifact. The use of an arbitrary cutoff point leaves in a consid- 
erable amount of the artifactual variance and eliminates a fair amount 
of the variance due to the factor under investigation. Inevitably, this 
primitive stage is succeeded by a more sophisticated approach to the 
problem which we here call the “correction” procedure. The experi- 
menter using this tactic attempts to retain all of the data collected and 
adjust each person’s scale score for the amount of artifactual variance 
that contaminates his responses. A classic example of this adjustment
procedure is given by the K scale of the MMPI. The subscale is here
used, not as a device for detecting and rejecting certain protocols, but
as a suppression scale score which furnishes a correction factor for each
person’s score, hopefully tailored to his amount of artifactual variance.

Other examples of correction procedures involve the use of control 
groups or conditions. For example, we might determine a person’s feel-
ings about a subject matter area by giving him a reasoning, retention
or perception task involving material from that area and calculating 
how much his score is affected by motivated distortion, after correcting 
the raw score for his capacity at this type of task on neutral materials. 
These correction procedures typically involve elaborate statistical 
adjustments. 
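
The two correction tactics above can be reduced to simple arithmetic for illustration. The sketch assumes a linear suppressor correction with a hypothetical weight and a plain difference score for the control-condition case; actual procedures typically involve more elaborate statistical adjustment than this.

```python
def corrected_score(raw_score: float, suppressor_score: float, weight: float) -> float:
    """Adjust a scale score by a weighted suppressor score.

    Every subject is retained; `weight` is a hypothetical correction factor,
    not a value from any published scale.
    """
    return raw_score + weight * suppressor_score


def distortion_score(loaded_score: float, neutral_score: float) -> float:
    """Performance on attitude-relevant material minus performance on neutral
    material of the same task type, so that general capacity is subtracted out."""
    return loaded_score - neutral_score


if __name__ == "__main__":
    print(corrected_score(raw_score=20, suppressor_score=10, weight=0.5))
    print(distortion_score(loaded_score=12, neutral_score=15))
```

Unlike the rejection tactics, nothing is thrown away here: the price is that the adjusted score is only as good as the suppressor or control measure used to compute it.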

3. Prevention modes of coping. The more we learn about the correc-
tion modes, the more hypersensitive they seem to be to the validity
of the scales. For example, it can be demonstrated that unless our predic- 
tor scale correlates at least .70 with the criterion, it is better to develop 
an additional predictor scale than to develop a suppressor scale to correct
the original one (Norman, 1961). In view of the tedium and the indiffer-
ent success of the correction modes, it is not uncommon to find that
attempts to cope with an artifact develop from an adjustment stage
to a prevention stage. These prevention tactics involve use of one or
another procedure that avoids the artifact’s occurring or at least its
contaminating our obtained scores.

In the case of response sets the prevention approach has taken several
forms. One procedure is to use counterbalanced scales, such as keying
the items so that “yes” and “no” responses equally often indicate posses-
sion of the trait. A second procedure is the use of ipsatizing procedures.
These sometimes take the form of a priori ipsatizing, as in the use of
forced choice items. In our opinion it is preferable that they take the
form of a posteriori ipsatizing, as, for example, pattern analysis. A third
prevention approach involves utilizing experimental procedures which
minimize the likelihood of occurrence of the artifact. For example, one
the impact of initial individual differences to a sensitization variable
of intrinsic interest, or we find in Chapter 6 Rosenthal’s account of
how the influence of the experimenter’s expectations on the obtained
results developed from being a worrisome contamination to the status
of a research program on nonverbal communication and social influence.
The case of response bias, which we are using to illustrate this account
of the career of an artifact, shows the typical happy ending. Variables
like social desirability (Crowne and Marlowe, 1964) or acquiescence
(Couch and Keniston, 1960) are now considered interesting individual
difference characteristics in their own right, rather than merely contami-
nants to be eliminated from our personality scales. We even find attempts
such as that of Messick (1960) to map out personality space entirely
in terms of what were once regarded only as biases to be eliminated
before such an enterprise could get underway.

In the case of the suspiciousness artifact, to which we devote the
remainder of our discussion in this chapter, research interest has only
recently entered the third phase. A necessary preliminary to the efficient
investigation of an independent variable, or even of an artifact, is that
we gain experimental control over it so that the experimenter is able
to manipulate it. In the next section we will consider a dozen or so
procedures by which the extent of the subject’s suspiciousness of the
experimenter’s intent can be manipulated. Almost all of these procedures
were developed for other reasons than the manipulation of the subject’s
suspiciousness, which indeed is the reason why such suspiciousness was
initially considered an artifact. As our interest in the suspiciousness vari-
able enters the third phase, the availability of so many procedures for
manipulating it is quite useful. Hence what is, during the second stage
of an artifact’s career, considered its deplorable pervasiveness, becomes,
in the third phase, a considerable convenience in its study. While the
suspiciousness problem is a pervasive one in research, to provide focus
for our discussion, all our examples of procedures for manipulating this
suspiciousness will be taken from the special area of attitude change
and social influence.


III. ANTECEDENTS OF THE AWARENESS OF
PERSUASIVE INTENT 

In attitude change research, the experimenter traditionally pretends
to the subject that his research deals with another topic. If he is studying
the effect of peer pressure on conformity, he might employ visual stimuli
and represent his study as an investigation of sensory acuity. If he is
studying the impact of persuasive messages on beliefs, the experimenter
might say he is studying reading comprehension ability and represent
the persuasive message as the test material. It seems to be taken for
granted that if one admitted the persuasive intent of the communication,
the subject’s behavior could not be interpreted and generalized to the
behavior of the naive subjects to whom our theories of persuasion are
supposed to apply. Hence, any sign that the subject is suspicious of
the persuasive intent of the experimenter is likely to elicit alarm. There
is consequently cause for concern, since in at least eleven lines of attitude
change research there is reason to suspect that the experimental manipu-
lation, in addition to (or instead of) varying whatever it is intended
to vary, might also be affecting the subject’s suspiciousness of persuasive
intent. Any relationship which is found might be due, not to the origi-
nally theorized effect of the manipulation, but to its impact on the sub-
ject’s suspiciousness. Some of these possibly artifactual manipulations
involve how the source is represented to the subject, others have to
do with the contents of the persuasive message, and still others concern
the experimental procedures. We shall consider each of these classes
only after they have been exposed to the opposition side and are sensi-
tized to controversiality. We shall consider each of these lines of work
as they bear on the question of suspiciousness of persuasive intent. It
should be noted that this suspiciousness is an intervening variable. Hence,
to understand its operation we must answer two questions: To what
extent do these antecedent conditions actually affect suspiciousness of
persuasive intent? And given that suspiciousness is affected, to what
extent is the ultimate dependent variable of opinion change (or what-
ever) further affected?

1. Perceived disinterestedness of the source. A number of studies
have involved varying the introductory description of the source in such
a way that he is represented to some of the subjects as having something
to gain from their agreement with his point of view, while for other
subjects he is made to appear more disinterested in the point about
which he is arguing. It seems reasonable to assume that the former
procedure will produce greater suspiciousness of the source’s intent to
persuade. For example, a given speech advocating more lenient treat-
ment of juvenile delinquents is judged to be fairer and produces more
opinion change when the speaker is identified as a judge or a member
of the general public, than when he is identified as someone himself
involved in juvenile offenses (Kelman and Hovland, 1953). There is,
however, evidence to suggest that the differential persuasiveness of these
sources is due to their status difference rather than their differential
intent to persuade. Thus Hovland and Mandell (1952) used a speech
favoring currency devaluation and attributed it, for half the subjects,
to an executive in an importing firm who would stand to profit financially
from such devaluation, while for the other half of the subjects, the
speaker was represented as a knowledgeable but disinterested academic
economist. The speech was judged considerably fairer when it came
from this latter, disinterested source, but it was equally effective in chang-
ing opinions regardless of the source. Put together, the results of the
two experiments suggest that by proper portrayal of source disinterested-
ness one can manipulate suspiciousness of intent to persuade, but this
differential suspiciousness does not seem to eventuate in any attitude
change differential unless the source’s status is also varied. In practice,
the source variables of disinterestedness, expertise, and status will often
be contaminated, and so the results of varying any one of them must
be interpreted carefully lest the contamination of the characteristic pro-
duce misleading results. At any rate, these “disinterestedness” manipula-
tions provide little support at present for the assumption that suspicious-
ness of the source’s persuasive intent reduces the amount of opinion
change he effects.
2. Source’s purported perception of his audience. We might assume
that the subject will be more suspicious of the persuasive intent of
the source if he is made to perceive that the source knows he is listening
than if he believes that he is overhearing the source without the latter’s
knowledge. Walster and Festinger (1962) did indeed find that women
are more likely to be persuaded by a given conversation if they think
they are inadvertently overhearing it rather than when they feel the
speakers are aware that they are listening, though this difference was
found only with highly involving topics. Subsequent work by Brock
and Becker (1965) indicated that the greater effectiveness of overheard
communication was even further limited, requiring that the sources
argue both in the direction which the audience wants to hear and also
on an involving issue. Mills and Jellison (1967) interpreted this limita-
tion of the difference to arguments in desirable directions as indicating
that a source is more likely to be judged sincere when he argues in
a direction which he knows undesirable to his audience. They found,
in line with this interpretation, that students are more influenced by
a speech favoring raising truck license fees if they are told it has
originally been given to truck drivers (for whom it would be arguing
in an undesirable direction) than when told it has been delivered to
railway men (who would have found its conclusion desirable). Walster,
Aronson, and Abrahams (1966) also found that a source has more impact
piciousness hypothesis are the results of Irwin and Brockhaus’s (1963)
study comparing the effectiveness of two speeches favoring the telephone
company, an educational type talk versus one more explicitly asking
for the subject’s approval. The educational one was judged as more
interesting, but the one more directly appealing for approval produced
more favorableness to AT&T. While this difference has been inter-
preted as indicating that the more explicit advocacy of the company
produces more effect, it seems to us that the conditions were such that
the differences could have been due to more personally relevant appeals
used in the disputatious version, or to the distracting effect of the infor-
mation in the educational version. Further evidence that overt disputa-
tiousness and partisanship might actually enhance attitude change im-
pact by clarifying the source’s point is indicated by a study in which
Sears (1965) presented material favoring the defense or prosecution
in a juridical proceeding and found that this material had more per-
suasive impact when it was clearly identified as coming from a defense
or a prosecution lawyer than when it purportedly came from a neutral
lawyer, even though the latter was rated as more trustworthy.

4. Order of presentation as affecting suspiciousness. The primacy-
recency variable becomes involved in the suspiciousness question since,
as Hovland, Janis, and Kelley (1953) conjectured, the first side in a debate
would have the advantage of seeming less controversial than the second,
particularly with a noncontroversial issue and in a situation not clearly
defined in advance as a debate. An audience would be more inclined
to interpret the first side’s presentation as a rounded view of the topic,
but when they received the second side it would be much clearer to
them that they were now hearing a one-sided viewpoint on an issue
where other views were quite possible. In this formulation, primacy
effects in persuasion are attributed to the subject’s greater suspiciousness
of persuasive intent while listening to the second side. Hovland (1957)
finds some suggestive support for this notion in his impression that
primacy effects are more pronounced in situations where a single com-
municator presents both sides than when each side is presented by a
different communicator.

This suspiciousness hypothesis predicts main order primacy effects
and, more manageably, a number of interactions between order of presen-
tation and other variables in the communications situation as they affect
opinion change. These interaction variables include the controversiality
and the familiarity of the issue, the use of suspicion arousing pretests,
etc. We have reviewed this literature in some detail elsewhere (McGuire,
1966, 196S), as has Lana in his chapter in this book and elsewhere
(Lana, 1964). In general, the experimental results seem to defy descrip-
tion by the suspiciousness hypothesis. As regards main effect, primacy
effects may be somewhat the more common, but recency effects are
far from rare. The interactions between the order variable and others,
such as issue controversiality, go in the direction opposite to that re-
quired by the suspiciousness hypothesis in some studies while confirming
it in others. Overall, the primacy-recency results offer little support
for the orthodox formulation that suspiciousness of persuasive intent
dampens persuasive impact.

B. Suspicion-arousing Factors Having to do with Message Style 
and Content 

Above we considered ways in which source presentation might arouse
suspiciousness of persuasive intent and thus purportedly affect the per-
suasive impact of the message. In this section we shall consider how
the content and style of the message might give rise to such suspicions.
One such possible variable is whether the conclusion is drawn explicitly
within the message, as opposed to being left for the subject’s own in-
ferring. Another content variable which might give rise to suspicion
is whether the opposition arguments are completely ignored or taken
into consideration within the persuasive message. Still another possible
message factor which might give rise to such suspicion is the extremity
of the position which is urged. Finally, we shall consider such stylistic
characteristics as the dynamism of the delivery as it might affect sus-
piciousness of persuasive intent. As has already been seen in the case
of source factors, the results regarding message factors which we shall
review here give surprisingly little support to the notion that arousal
of suspiciousness tends to reduce persuasive impact.
1. Explicitness of conclusion drawing. The belief that a conclusion
is more persuasive if the person derives it for himself (rather than having
it announced to him by the source, however prestigeful) has been cur-
rent at least since the beginning of the psychoanalytic movement and
nondirective therapy in general. Freud indicated that he abandoned
hypnotherapy, with its stress on therapist suggestion, in favor of psycho-
analysis, with its stress on the patient’s active participation in the dis-
covery of the bases of his problems, in part because of the incredulity
with which many of the therapist-drawn conclusions were received by
the patient. Indeed, psychoanalytic theorists have developed an episte-
mology as well as a therapy based on the notion that its insights require
personal experience and self analysis, rather than simply external presen-
tation, in order to obtain credence and comprehensibility. There are,
of course, other theoretical reasons for advocating that the patient par-
ticipate actively in the drawing of conclusions regarding the nature of
his problem. Any theory of therapy which depended on such concepts
as abreaction, emotional catharsis, rapport, transference, etc., would tend
to encourage the patient’s active participation in the therapeutic process
even aside from credibility factors. However, the notion that the patient
is more likely to believe the therapist’s interpretation of his problem
if he himself actively participates in the arrival at the conclusion, rather
than having the conclusion presented to him passively, provides at least
part of the motivation for urging nondirective therapy.

The empirical results give little support for this notion that a message
is more persuasive if it leaves the conclusion to be drawn by the
subject. The early work by Janis and King (1954; King and Janis, 1956)
did seem to indicate that a subject was more persuaded by actively
improvising a speech, rather than by passively reading or listening to a
comparable speech. However, subsequent research has cast considerable
doubt on the persuasive efficacy of active improvisation, as reviewed
recently by McGuire (1963). The Hovland and Mandell (1952) study
indicated that allowing the subject to draw the conclusion for himself,
far from being more efficacious, actually produced far less opinion
change than when he had the conclusion passively presented to him.
A number of other studies have likewise failed to indicate that a message
which allows the subject to draw the conclusion for himself, and thus
would presumably arouse less suspiciousness of persuasive intent, was
more persuasive than was a more explicit conclusion drawing (e.g.,
Cooper and Dinerman, 1951).

What we seem to have here is a situation in which any enhanced
effectiveness due to the increased credibility that is produced by the
subdued, implicit-conclusioned message through its lesser arousal of
suspiciousness is more than cancelled by its loss of effectiveness due to
the subject's failure to get the point. We have been arguing frequently
of late that most of the difficulty in persuading the audience (both
in laboratory experiments and in naturalistic mass media situations)
derives from the difficulty of getting the apathetic audience to attend
to and comprehend what we are saying, rather than in overcoming its
resistance to yielding to our arguments. The barrier is provided by
intellectual indolence, rather than by motivated resistance. It seems
quite possible that those who do in fact actually draw for themselves the
conclusion of the implicit message may be more persuaded thereby,
but it is more apparent that very few do in fact avail themselves of
the opportunity actively to draw the conclusion or rehearse the
arguments (McGuire, 1964). It is also probable that there is a gradual
“filtering down” of the persuasive impact from the explicit premises to
the implicit conclusion with the passage of time (Cohen, 1937; Stotland,
Katz, and Patchen, 1959; McGuire, 1960, 1968). A cognitive inertia may
prevent the need for cognitive consistency from manifesting its full
effect on remote issues immediately after the message. The studies cited
suggest that with the passage of time these logical ramifications are
increasingly discernible in the belief system, as the initial inertia is
gradually overcome. Even over time, however, the impact of the implicit
message only catches up with, rather than surpasses, that of the explicit
message.
2. Treatment of the opposition arguments. We might expect that
the treatment of the opposition's arguments would have some influence
on the obviousness of our intent to persuade, and thus affect the
persuasive efficacy of our message. A message which is completely
one-sided, ignoring the existence of opposition arguments of which the
subject may be quite aware, should seem more biased and blatantly
attempting to persuade than would a message which took into account the
opposition arguments by mentioning them and attempting to deal reasonably
with them. Yet the World War II studies in the Army indoctrination
program indicated that neither the one-sided nor the “two-sided” message
had an overall greater persuasive impact, where the former presented the
arguments for one's own side and ignored completely the opposition
arguments, while the latter presented one's own side but at least
mentioned and sometimes refuted the opposition arguments (Hovland,
Lumsdaine, and Sheffield, 1949). In fact, the latter was not even
perceived as more fair a presentation, the impression of objectivity
being, if anything, in the reverse direction. This peculiarity may have
derived from the peculiar condition that the “two-sided” message ignored
one of the most salient opposition arguments, while refuting less salient
ones. It may be that to elicit the appearance of objectivity by the
mention of the opposition arguments, one loses more credibility than he
gains unless he is careful to mention all of the salient
counterarguments. As far as the direct persuasive impact of refuting
versus ignoring the opposition is concerned, the results seem to indicate
that counterarguments which the subjects are likely to think of
spontaneously are best refuted and those which would not arise
spontaneously are best ignored, if one wishes to achieve maximum
persuasive impact. Hence, less intelligent subjects and those who are
closer in their initial position to the conclusion being urged tend to be
more influenced by messages which ignore the opposition arguments, while
refuting the opposition argument tends to be more effective with subjects
of higher intelligence and those further in the opposition as regards
their initial opinions. Refuting, rather than ignoring, the opposition
arguments does seem to be superior in developing resistance to subsequent
counterattacks. The superior immunizing efficacy of mentioning and
refuting (rather than ignoring) opposition arguments has been
demonstrated by Lumsdaine and Janis (1953), McGuire (1964), Tannenbaum
(1966), and others. It should be noted in the present connection,
however, that the suspiciousness of persuasive intent mechanism does not
seem to play any major part in the immunizing efficacy of considering the
opposition arguments. The evidence currently seems to indicate that
resistance conferral derives from the motivating threat which the mention
of the opposition argument arouses. It is conceivable, though, that the
subsequent persuasive attack is less effective also because the prior
mention of its arguments makes the subject more suspicious of its
persuasive intent.
3. Extremity of message position. Suspiciousness of persuasive intent
would seem to occur with greater probability as the position espoused
in the message became more and more extreme. In so far as this
suspiciousness factor is concerned, increasing the discrepancy between
the position urged in the message and the subject's initial position
should progressively reduce the persuasive impact. It would be naive,
however, to disregard the likelihood that other processes mediate the
relationship between message discrepancy and the amount of opinion
change. For example, Anderson and Hovland (1957) postulate that a reverse
relationship obtains, such that amount of attained opinion change is an
increasing function of amount of change urged. This position is plausible
since when a discrepancy is quite small the amount of change produced
would be relatively minor even if the message was completely effective,
while with large discrepancies, even a partly effective message could
produce a considerable absolute change. These considerations have led a
number of theorists to posit an overall nonmonotonic relationship between
amount of obtained change and amount of urged change (Osgood and
Tannenbaum, 1955; Sherif and Hovland, 1961), with maximal opinion
change occurring at intermediate discrepancies.
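
The contrast between a proportional and a nonmonotonic relationship can be
made concrete with a small numerical sketch. What follows is only an
illustration under assumed functional forms: `proportional_change` treats
obtained change as a fixed fraction of the change urged, in the spirit of
the proportional position, while `suspicion_adjusted_change` lets that
fraction decay exponentially as the discrepancy grows, a hypothetical
stand-in for rising suspiciousness. The function names, the acceptance
fraction of 0.4, the decay scale of 6, and the arbitrary discrepancy units
are all assumptions introduced here; none comes from the studies cited.

```python
# Illustrative sketch only: the functional forms and constants below are
# assumptions, not values taken from the studies cited in the text.
import math


def proportional_change(discrepancy, acceptance=0.4):
    """Proportional model: obtained change is a fixed fraction of urged change."""
    return acceptance * discrepancy


def suspicion_adjusted_change(discrepancy, acceptance=0.4, scale=6.0):
    """Same model, but the accepted fraction decays as the position urged
    becomes more extreme (a hypothetical proxy for growing suspiciousness)."""
    return acceptance * math.exp(-discrepancy / scale) * discrepancy


# Discrepancy between position urged and initial position, in arbitrary units.
discrepancies = list(range(0, 13))
proportional = [proportional_change(d) for d in discrepancies]
adjusted = [suspicion_adjusted_change(d) for d in discrepancies]

# Discrepancy at which the suspicion-adjusted curve reaches its maximum.
peak = discrepancies[adjusted.index(max(adjusted))]

for d, p, a in zip(discrepancies, proportional, adjusted):
    print(f"discrepancy={d:2d}  proportional={p:5.2f}  adjusted={a:5.2f}")
print("adjusted change peaks at discrepancy", peak)
```

With these assumed values the proportional curve rises throughout, whereas
the adjusted curve rises, peaks at an intermediate discrepancy, and then
falls. A larger decay scale would push the peak toward more extreme
positions.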

This theoretical formulation that, as discrepancy becomes quite large,
it produces sufficient suspiciousness of persuasive intent to overcome
the effect postulated in the Anderson and Hovland proportional model,
is quite plausible, but empirical work has shown that it occurs only
at very extreme ranges of discrepancy. Over a surprisingly wide range,
the monotonic relationship holds such that the greater the discrepancy
the greater the induced change. It is true, though, that some
experimenters who persevered to the extent of producing extremely wide
discrepancies have succeeded in demonstrating a reversal in effectiveness
as the position urged became quite extreme (Hovland, Harvey, and
Sherif, 1957; Fisher and Lubin, 1958; Whittaker, 1961; etc.). As might
be expected from those suspiciousness explanations, the turn-down is
most likely to occur with low credible sources (Bergin, 1962; Aronson,
Turner, and Carlsmith, 1963) and with ambiguous issues (Insko,
Murashima, and Saiyadain, 1966), and where commitment to one's initial
position is high (Freedman, 1964; Greenwald, 1964; Miller, 1965). The
turn-down is probably least likely to occur where the subject experiences
a great deal of evaluation apprehension (Zimbardo, 1960), which is
discussed more fully in Rosenberg's chapter in this book.

As compared with most of the lines of research we have been considering,
the evidence for the straightforward suspiciousness hypothesis (that
as the position urged becomes more extreme, suspiciousness of persuasive
intent increases and persuasiveness decreases) is fairly encouraging.
Still, it should be noted that it takes rather surprising degrees of
extremity before any such effect is manifested.

4. Style of presentation. It seems likely that suspiciousness of
persuasive intent can be aroused, not only by the content of the message,
but also by the style in which it is presented. A dynamic style of
presentation seems more likely to arouse such suspicion than does a
more subdued style; more conjecturally, an elegantly worded and
presented speech might seem more suspicious than an improvised informal
style.

As regards the intensity of presentation variable, Hovland, Lumsdaine,
and Sheffield (1949) found no differences either in attitude change or
in perceived intent to persuade between two forms of an argumentative
presentation used with U.S. Army personnel in World War II: a dynamic
documentary-style presentation and a subdued narrator style. Greater
attention to this intensity of style variable is given by researchers in
the speech area than in psychology. Bowers (1964) has attempted to
determine the components of judged intensity of language. Both he
(Bowers, 1963) and Carmichael and Cronkhite (1965) have found some
very slight tendency (not reaching conventional levels of significance)
for the more intense speech to produce less attitude change. One study
suggests that the use of metaphors may be a special case of language
intensity in this regard. Bowers and Osborn (1966) find that highly
metaphorical speech, which is judged to constitute a more intense style,
produces more attitude change. Possibly metaphor constitutes a special
type of intensity in this regard because, as Aristotle and Cicero
suggested, it increases the perceived intelligence of the speaker. If so,
the mechanism involved in the metaphorical effect might be perceived
source competence rather than perceived intent to persuade.

It seems reasonable, if not quite compelling, to assume that an
informal, extemporaneous-seeming presentation would not arouse
suspiciousness of intent to persuade quite as saliently as would a more
polished and organized presentation. Hence, one would predict that
the source employing this latter, more polished style will be perceived
more suspiciously and will be less efficacious in producing opinion
change. However, a number of counteracting processes would also seem
to be operative in connection with this variable. The more polished
style would also be likely to produce a greater comprehension of the
message content and would tend to raise the perceived competence
of the speaker (Sharp and McClung, 1966). Addington (1965) found
no difference in opinion change impact as a function of how many
mispronunciations had been introduced into the speech. Miller and
Hewgill (1964) found that other inelegancies of speech, such as pauses,
did produce a lower perceived competence of the source but did not
affect his perceived trustworthiness. In this area of research,
differences in the mediating processes (such as the perception of source
characteristics and message comprehension) which were produced by the
stylistic variables did not seem to eventuate in any impressive amount of
opinion change differentials.

C. Experimental Setting as a Factor Arousing Suspiciousness 
In this section we turn from intrinsic communication variables (such
as source and message factors) to a consideration of how extrinsic
factors deriving from the experimental setting might affect
suspiciousness and, consequently, attitude change. We shall consider such
variables as the clarity with which the situation is depicted as a
psychological experiment, the use of an attitude pretest which might
arouse suspiciousness that one's persuasibility is being investigated,
and the introduction of explicit warnings that the experiment deals with
persuasibility. It is in this area that we have the most clear-cut
examples of how a procedure or variable which initially attracts
attention purely for methodological reasons begins to gain theoretical
interest in its own right.

1. Revealing the experimental content. It seems likely that the subject
will become more suspicious of the persuasive intent of the messages
presented in attitude change research if we reveal to him that he is
taking part in an experiment. McGinnies and Donelson (1962) had their
subjects read messages to other subjects which advocated a negative
attitude towards ecclesiastical matters. It was revealed to half of these
subjects that their own attitudes were under investigation. They found
some slight evidence that this revelation did reduce the persuasive
impact of the message for initially opposed subjects, but only in some
subgroups. On the other hand, Silverman (1968) found greater compliance
with the message in situations that were clearly designated to the
subject as psychological experiments, in keeping with the “demand
character” notion considered in more detail in Orne's chapter in this
volume. This interpretation receives further support from the fact that
this greater conformity in the revelation condition occurred to a greater
extent with subjects who had to identify themselves and with female
subjects.

Further evidence on this point is given by studies of the effect of
“debriefing.” Deliberate deception in experiments gives rise to the felt
necessity on the part of most experimenters who use deception to employ
also a “debriefing” or “catharsis” treatment at the end of their
experiment. During this final procedure, the true purposes of the
experiment are explained to the subject, and the deceptions employed are
pointed out to him, along with the reasons why they were employed. We
shall return to the ethical considerations in the final section of this
chapter; here we shall focus on the theoretical aspects. There has long
been some concern that participation in deception experiments and going
through these debriefing procedures produces suspicious, experiment-wise
persons who are unsuited to serve as subjects in subsequent experiments
because this acquired sophistication will cause them to behave in a
way unrepresentative of the more naive population to whom the results
are to be generalized. It does seem plausible that the revelation during
prior debriefing about the deception used in the earlier experiment will
make the subject suspicious about what is going on in subsequent
experiments and hence harder to persuade. However, the results to date
provide little substantiation for this reasonable concern. Fillenbaum
(1966) finds that the performance of the “faithful” subject who has been
exposed to prior deceptions yields results little different from those of
the more naive subjects. Indeed, though both previously deceived and
naive subjects included a sizable number of suspicious persons,
suspiciousness did not seem to affect their experimental performance in
any important way. Brock and Becker (1966) find that prior participation
in a deception experiment with debriefing produces surprisingly little
effect on performance in a subsequent test experiment, even when it
follows immediately afterwards. Only when the test experiment and the
prior debriefing experiment were made ostentatiously similar was
performance found to be affected in a substantial way. This work is quite
reassuring (or disappointing, depending on one's initial attitude)
regarding the possible contaminating effect of the subject's
suspiciousness of the experimenter's true purpose in the experiment. Not
only do manipulations which seem quite likely to arouse subjects'
suspicion fail to produce any noticeable change in the obtained
relationship, but even when one does internal analyses separately for
suspicious and non-suspicious subjects, the two groups yield surprisingly
similar relationships about the hypothesis in question.

In my own research on attitude change, where I usually represent
the persuasive communications as part of a test of reading
comprehension, we rather routinely introduce near the end of the
experiment a questionnaire of some subtlety designed to detect any
suspicions regarding the true nature of the experiment. Subjects can then
be partitioned on the basis of their responses into high and low
suspiciousness of persuasive intent. In many experiments we have analyzed
the data separately for the sub-group of subjects who seem to indicate at
least a moderately good grasp of the true nature of the experiment, which
tends to include about 15% of the total sample. So far, we have never
found significant differences between suspicious and non-suspicious
subjects as regards the effects of any of the important variables.
Hence, we have never had to face the anguishing decision as to whether
or not we should eliminate from our experiments a particularly
suspicious subject, which, as we indicated in a previous section of this
paper, is an inadequate methodological solution for the problem and
also tends to raise more problems of generalizability than it resolves.
Judging from the uniformity of our own results, we suspect that many
other researchers have had the same reassuring experience when they
performed a similar internal analysis.
2. Pretests as a suspicion arouser. The subject's suspiciousness that
we are investigating his persuasibility in a disguised attitude change
experiment seems more likely to arise if we employ a pretest than if
we use an after-only design, particularly when the pretest involves an
undisguised opinionnaire administered just prior to the persuasive
messages. We face here a classical question of experimental design
involving the efficiency of “before-after” versus “after-only” designs
(Hovland, Lumsdaine, and Sheffield, 1949) and the inclusion of control
groups (Solomon, 1949). The current state of this question is reviewed in
detail in Lana's chapter of this volume on Pretest Sensitization. To
oversimplify somewhat the conclusion to be drawn from the pretest
experimentation as regards the current suspiciousness issue, it seems to
us that there is evidence from this work of a rather slight depressing
effect of using a pretest, as one would expect on the basis of the
straightforward suspiciousness hypothesis that the pretest arouses
suspicion and therefore reduces the amount of opinion change induced.
There are, however, a few experiments in which a pretest actually
enhances the main effect of the manipulation, as one might predict on the
basis of a “demand character” interpretation (as discussed more fully in
Orne's chapter of this volume). And quite frequently, the pretest is
found to produce no main effect at all.

Even if we do tentatively accept that pretests
arouse suspicion, which then slightly decreases the main effect of our
independent variables, the methodological anguish that such a main
effect should provoke can be quite low. We have indicated elsewhere
(McGuire, 1966, 1968) that a serious problem of interpretation would
occur only if we find that the pretest interacts with our main
independent variable. Studies in which there is an interaction between
this design feature and the independent variable are exceedingly rare.
Hence, it seems likely that we will be misled, at most, by failing to
detect some relationships because of use of a pretest, rather than being
misled into finding the “wrong” kind of relationship that would not be
generalizable to a more naive unpretested population.
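
The distinction drawn here between a pretest main effect and a
pretest-by-treatment interaction can be illustrated with a toy 2 x 2
table of cell means. The sketch below is offered only as an aid to the
argument: the cell means are invented, and the helper names
`message_effect` and `interaction` are hypothetical labels introduced for
this example rather than anything reported in the studies cited.

```python
# Hypothetical cell means (amount of opinion change) for a 2 x 2 design:
# message (control vs. persuasive) crossed with pretest (absent vs. present).
# Every number here is invented purely for illustration.


def message_effect(cells, pretest):
    """Effect of the persuasive message within one pretest condition."""
    return cells[("persuasive", pretest)] - cells[("control", pretest)]


def interaction(cells):
    """Difference between the message effects with and without a pretest."""
    return message_effect(cells, "pretest") - message_effect(cells, "no_pretest")


# Case A: the pretest lowers every cell by the same amount (main effect only),
# so the message effect is the same size in both pretest conditions.
main_effect_only = {
    ("control", "no_pretest"): 1.0, ("persuasive", "no_pretest"): 3.0,
    ("control", "pretest"): 0.5, ("persuasive", "pretest"): 2.5,
}

# Case B: the pretest changes the size of the message effect (interaction),
# so results from pretested subjects would not generalize to unpretested ones.
with_interaction = {
    ("control", "no_pretest"): 1.0, ("persuasive", "no_pretest"): 3.0,
    ("control", "pretest"): 1.0, ("persuasive", "pretest"): 1.5,
}

for label, cells in [("main effect only", main_effect_only),
                     ("interaction", with_interaction)]:
    print(label,
          "| effect without pretest:", message_effect(cells, "no_pretest"),
          "| effect with pretest:", message_effect(cells, "pretest"),
          "| interaction:", interaction(cells))
```

In the first case the relationship under study is estimated identically
with or without the pretest, even though overall levels of change are
lower; only in the second case does the pretest alter the relationship
itself.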
3. Effects of forewarning of persuasive intent. The most
straightforward procedure for investigating the effect of suspiciousness
on the amount of opinion change produced would seem to be designing an
experiment with a well-disguised attitude change induction, and then
explicitly stating the persuasive intent of the communications to half
the subjects while the remainder of the subjects are given a quite
different plausible explanation of the materials to be read. We would
then expect, in line with the orthodox suspiciousness notion, that the
informed subjects, made aware of the persuasive intent of the messages,
would come forearmed and premotivated to resist the predesignated
belief-discrepant communications. Rather slight support is given to this
common sense notion in several experiments. On the basis of internal
analyses within some subsets of subjects, Allyn and Festinger (1961)
report that teenage subjects were more influenced by an anti-driving
speech when they were led to believe it was being presented to them to
study how well they could judge the personality of the speaker rather
than to assess their opinions. However, the results from the total
subject sample did not confirm this orthodox prediction at the
conventional level of
significance. McGuire and Papageorgis (1962) found that a forewarning
of an impending persuasive attack on certain cultural truisms accepted
by the subjects indirectly strengthened their resistance to these attacks
[...] opportunity to study defensive material in [...] forewarning did
not directly enhance the [...] when no defensive material intervened.
[...] detect any resistance to suggestion produced by explicitly
reminding the subjects that they had answered a pretest and should give
similar answers on the post-test after receiving some normative
feedback. Wright (1966) likewise finds
no significant across-conditions superiority of a direct over an indirect
influence attempt, though he finds some suggestion that when coming
from a liked partner, the indirect message is somewhat more effective.
Brehm's (1966) reactance theory leads to the notion that the subject
will tend to respond to a source's attempt at persuasion with a
“boomerang” response when the attempt is too blatant. Perhaps the study
that comes nearest to indicating any strong resistance conferral effect
of warning is that by Freedman and Sears (1965), and even their results
seem somewhat dependent upon the time parameters, and the same might
be said of the Kiesler and Kiesler (1964) study.


This body of research on the effects of explicit warning of persuasive
intent has been frustratingly elusive as regards its implications. There
does seem to be a relationship begging to be found, and yet it seems
to be hiding out in only certain cells of our experimental design. That
it so seldom shows up as an across-condition significant effect suggests
that, while in some cells the warning reduces the persuasive impact,
under other conditions the warning enhances impact. The source of
such powerful interactions may be found in the demand character of
the experiment and in the attractiveness of the source, as the research
which we will discuss below seems to indicate.

In general, the results of these many lines of research considered
in this section are not particularly alarming as regards the possible
artifactual nature of results obtained under conditions that might make
the subject suspicious of the experimenter's intent. It might be, of
course, that some of the experimental variables did not actually
manipulate in any dramatic way the degree of suspiciousness. However, we
have seen a number of cases in which the independent variable did seem
to manipulate suspiciousness to a considerable extent, and still no
overall main effect in terms of differential opinion change eventuated.
In a few cases, there was a diminution of communication effectiveness
after suspicion was aroused; in the vast number of experiments, no
overall significant difference occurred as a function of suspiciousness;
and in a few experiments, arousing suspiciousness actually increased the
amount of change. Furthermore, such effects of warning as have been found
tend to be main effects, which are annoying rather than misleading.
Evidence for the more worrisome interaction effects is almost
nonexistent. As regards the main effect, where there is evidence of
enhanced resistance, it is still unclear what is the mechanism by which
suspiciousness reduces attitude change. Does it operate by giving the
person a chance to marshal his defenses, or by making it more difficult
for him to yield to the outside influence without suffering more loss of
self-esteem than he is willing to countenance, or by some other
mechanism? Furthermore, we have been forced to suspect that under a
number of conditions suspiciousness of intent actually enhances the
persuasive impact of the message. Again, the mechanism question arises.
Such a result could be obtained in various ways, for example by
clarifying the demand character of the experiment, or by the channeling
of his ingratiation or cooperative motives, etc. In the following section
we shall turn to a consideration of what are some of the theoretical
formulations that seem called for.


IV. THEORETICAL HOUSINGS OF THE 
SUSPICIOUSNESS VARIABLE 

In the previous section we looked somewhat askance at the suspiciousness
variable, regarding it somewhat as a poor relation whose advent
spelled trouble. In this section we shall look at the suspiciousness
variable in a more positive way, asking what interesting processes it
might involve and what opportunities for theoretical elaboration and
refinement it might offer. We shall first consider some matters of
definition to clarify the question regarding just what the subject is
supposed to be suspicious about in order for the hypothesized effect to
occur, and what areas of behavior suspiciousness is supposed to affect.
After dealing with the definitional problem, we shall turn to a
consideration of the various mediating factors which seem possibly to be
involved with the suspiciousness variable and which could result in
either enhancing or diminishing the persuasive impact of experimental
messages. We shall then consider much more briefly some of the temporal
considerations and individual difference factors that seem involved in
the suspiciousness effects.


A. The Problem of Definition 

That in the previous section we considered as many as eleven rather
separate lines of research purportedly giving rise to suspiciousness of
experimenter's intent should lead us to expect that this suspiciousness
variable is not a completely homogeneous concept. Hence, some conceptual
clarification seems called for here if the results of the suspiciousness
variable are not to be unnecessarily confusing. First we shall consider
the question of what the person is supposed to be suspicious about
in order that the predicted effect occurs. We shall then point out some
needed distinctions regarding the several different dependent variables
[...]

[...] section of this chapter we pointed out that the suspiciousness
variable is practically coterminous with the awareness variable, and
hence arises pervasively over the whole [...] of psychological research.
To ask about the effect of “suspiciousness of experimenter's intent” is
to ask what is the effect of awareness of what is going on in the
experiment. We pointed out that currently in psychology this awareness
issue has arisen particularly in regard to the work on verbal
conditioning and in the area of attitude change, and that we would
confine our discussion to the latter. Even within the narrow realm of
attitude change experiments, several further distinctions are useful to
avoid unnecessary confusions. For example, in the experiments involving
forewarning of persuasive intent, Papageorgis' (1967) work indicates we
should distinguish between situations in which the person is simply
warned that the (unspecified) communication which he is about to hear is
designed to persuade him, as compared with situations in which he is also
warned regarding the precise issue and the side which is to be developed
by the message.

Still another distinction which seems necessary to facilitate
generalization of laboratory results to the real world is the distinction
between being aware that the communicator is trying to persuade oneself
and being aware that one's persuasibility is being studied. For example,
the former obtains in most naturalistic situations to which we would
want to generalize our laboratory results on attitude change, in that
the person is at least preconsciously aware that the material with which
he is being presented was designed to influence his beliefs and behavior.
For example, the average audience being exposed to an advertising
presentation, a political speech, a disputation with a friend, etc., is
more than a little suspicious that the material with which he is being
presented is designed to influence him. Hence, when in the laboratory we
strain our intellectual and moral resources in order to design some
elaborate deception which will hide from the subject the persuasive
nature of the material, we are paradoxically making it more difficult to
generalize to the naturalistic situation, even though the researcher
frequently justifies the deception as necessary for extrapolation to the
real world. Are we then making a peculiar logical error in calling
suspicious laboratory situations artifactual, rather than regarding
situations in which the subject's suspicions are allayed by deceptions as
the artifactual ones?

Behind our conventional thinking in this area, there seems to lie the
assumption that it is particularly essential to prevent the subject’s
becoming aware that his persuasibility is being studied, since this
awareness would seriously affect his behavior and it is not operative in
the naturalistic situation. Hence, to make his attitude change behavior
more comparable between laboratory and naturalistic settings, we try by
deception to divert his suspiciousness into some other channel. This
strategy represents a peculiar and devious compromise. In the natural
setting, the person is suspicious of the persuasive intent of the
communication that is being presented to him, but he is not at all
suspicious that his reactions to it are being studied. To achieve
comparability in the laboratory, we design the situation so that the
subject suspects neither that the material presented was designed to
persuade him nor that his persuasive reaction is being studied. Without
the deception, he would be suspicious both that the material was designed
for persuasive purposes and that his own persuasibility is being measured.
It can be seen that both the deception experiment and the undisguised
experiment deviate from the naturalistic situation in one crucial manner.
Perhaps some reexamination is necessary as to whether the typical two-fold
deception experiment is any closer to the naturalistic situation to which
we wish to generalize than is the somewhat more tolerable (intellectually
and morally) fully undisguised experiment. Even more clearly, the
situation seems to call for the study of each of these dimensions of
awareness separately and in combination, rather than choosing one or the
other for exclusive study or, worse, confounding the two.
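
The comparison just drawn can be laid out as a small two-by-two crossing
of the two awareness dimensions. The following Python sketch is only an
illustration: the factor names and the coding of the three situations are
assumptions introduced here, not terminology or procedure taken from the
studies cited.

```python
from itertools import product

# Hypothetical factor names: each value is True when the subject is aware.
FACTORS = ("aware_of_persuasive_intent", "aware_of_being_studied")

# The three situations described in the text, coded on the two factors.
SITUATIONS = {
    "naturalistic": (True, False),
    "two-fold deception experiment": (False, False),
    "undisguised experiment": (True, True),
}


def mismatches(a, b):
    """Count the factors on which two situations differ."""
    return sum(x != y for x, y in zip(a, b))


def full_design():
    """All four cells obtained by crossing the two awareness factors."""
    return list(product((True, False), repeat=len(FACTORS)))


if __name__ == "__main__":
    natural = SITUATIONS["naturalistic"]
    for name, cell in SITUATIONS.items():
        print(name, dict(zip(FACTORS, cell)),
              "differs from naturalistic on", mismatches(cell, natural), "factor(s)")
    # The fourth cell (unaware of intent, aware of being studied) appears
    # only when both factors are crossed.
    print("full crossing:", full_design())
```

Under this coding each laboratory design departs from the naturalistic
cell on exactly one factor, and only the full crossing allows the two
kinds of awareness to be examined separately and in combination.
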
So far we have seen that there are several levels of awareness: is
the subject aware that the material was designed to persuade him, is
he aware of the issue and side on which it will argue, and is he aware
that his attitudinal or behavioral response to the message is being
evaluated? There is an even higher level of awareness of persuasive
intent, since we can ask further whether the subject is aware of the
particular hypothesis being investigated. For example, the experiment
might be designed to test the hypothesis that there is a nonmonotonic,
inverted U-shaped relationship between fear arousal and the persuasive
impact of the message. The subject could be aware of all the points so
far discussed (for example, that the experiment involves persuasion, that
it deals with a given issue and side, and that the extent to which he
is influenced by the material presented to him will be measured) and
yet he might be quite unaware of the particular hypothesis about fear
appeals. Hence we could produce still higher degrees of awareness
of the experimenter’s intent by making differentially clear to him the
independent variable in the experiment, its hypothesized relationships
to the dependent variable, and the level of the independent variable
to which he himself is being exposed. Research results have been unclear
about the effect of suspiciousness in general, and also about the
differential effects of suspiciousness of these different aspects of the
experiment, two deficiencies that are probably interrelated.

2. Clarification of the dependent variables. Since suspiciousness of
persuasive intent constitutes a mediating variable in most of the
theorizing into which it enters, we must be concerned with “checking our
manipulations” as well as with measuring our dependent variable.
Thus, if we are testing how communications with implicit versus explicit
conclusions affect opinion change via the mediation of suspiciousness of
persuasive intent, we must not only measure the dependent variable
of opinion change but also have some direct measure of
the purported mediating suspiciousness. In some of the research discussed
in the previous section, where suspiciousness of persuasive intent was
indeed theorized to be operative, there was such a check on the purported
mediator. However, in many of the studies cited, suspiciousness
of persuasive intent arose as a possible artifact suggested by later
commentators, and in these cases there usually was no such direct measure
of this purported process. Where the predicted relationship does hold
between the antecedent manipulation and the amount of opinion change,
but the direct measure of the suspiciousness does not show any difference
as a function of the manipulation, doubt is raised regarding whether
this process does indeed enter into the relationship. However, we might
alternatively wonder if our measure of this mediator, usually a self-report
instrument devised without too much consideration (one tends to worry
less about constructing this incidental “check” than about measuring
the dependent variable), is indeed adequate to pick up fluctuations in the
suspiciousness. Where this suspicion mediator is found to vary in
the appropriate direction, the question remains whether this variation is
adequate in amount to account for the obtained difference on the
dependent variable of opinion change. A covariance analysis could test
whether the relationship between the antecedent manipulation and opinion
change remains significant, even when we adjust for the variance due to
suspiciousness. Here again we would probably draw any conclusions
only tentatively, since it is unlikely that we would have any considerable
confidence in the quantitative precision of our measuring instrument
for suspiciousness.
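
The covariance adjustment just described can be illustrated with a
minimal sketch. The Python below is an illustration under stated
assumptions, not the analysis used in any of the studies cited: it fits an
ordinary least-squares model with a 0/1 manipulation indicator and a
suspiciousness score as predictors, and compares the manipulation's effect
before and after adjustment. The function name, the variable names, and
the data are hypothetical, and a real analysis would also need a
significance test and some allowance for measurement error in the
suspiciousness score.

```python
def _mean(values):
    return sum(values) / len(values)


def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def adjusted_effect(manipulation, suspicion, change):
    """Return (unadjusted, adjusted) effects of the manipulation on opinion change.

    manipulation: 0/1 indicator per subject (hypothetical coding)
    suspicion:    measured suspiciousness per subject (the purported mediator)
    change:       opinion change per subject (the dependent variable)
    """
    # Center every variable so the intercept drops out of the normal equations.
    t = [x - _mean(manipulation) for x in manipulation]
    s = [x - _mean(suspicion) for x in suspicion]
    y = [x - _mean(change) for x in change]

    s_tt, s_ss, s_ts = _dot(t, t), _dot(s, s), _dot(t, s)
    s_ty, s_sy = _dot(t, y), _dot(s, y)

    det = s_tt * s_ss - s_ts ** 2
    if s_tt == 0 or det == 0:
        raise ValueError("manipulation is constant or collinear with suspiciousness")

    # Simple slope: for a 0/1 indicator this equals the raw mean difference.
    unadjusted = s_ty / s_tt
    # Partial slope for the manipulation with suspiciousness held constant.
    adjusted = (s_ss * s_ty - s_ts * s_sy) / det
    return unadjusted, adjusted


if __name__ == "__main__":
    # Made-up data in which opinion change depends on suspiciousness alone.
    manipulation = [0, 0, 0, 0, 1, 1, 1, 1]
    suspicion = [1, 2, 1, 2, 3, 4, 3, 4]
    change = [2 * v for v in suspicion]
    print(adjusted_effect(manipulation, suspicion, change))
```

In these made-up data the raw difference between conditions disappears
once suspiciousness is held constant, which is the pattern one would
expect if suspiciousness fully mediated the effect of the manipulation.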

At least three quite different dependent variables have been used
in testing how suspiciousness of persuasive intent affects
persuasibility. In some studies (McGuire and Millman, 1965; Papageorgis,
1967) the dependent variable has been the direct impact
on opinions of the warning of persuasive intent, even before the
persuasive communications are actually presented. In most studies the
dependent variable is the effect of the warning on the persuasive impact
of the message when it is actually presented (Allyn and Festinger, 1961;
Freedman and Sears, 1965). Still other studies (McGuire and Papageorgis,
1962) have investigated the extent to which the suspicion of impending
persuasive attack enhances the immunizing efficacy of a prior
defense presented before the forewarned attack occurs. These three
dependent variables would not be expected to yield exactly the same
relationships, and so ignoring the distinction among them is likely to
lead to some confusion. More important, analyzed in conjunction, they
can help considerably in clarifying the processes involved, since the
several mechanisms associated with suspiciousness affect these different
dependent variables in somewhat different ways, allowing us to tease out
and evaluate the several factors involved.


B. Possible Mechanisms for Suspiciousness Effects 

Suspiciousness of persuasive intent could have an effect on the amount
of opinion change produced via any of a number of mechanisms. The
operation of some of these would enhance the persuasive impact while
others should mitigate it. Still others seem able to operate in either
direction. We shall first consider three mechanisms associated with
suspiciousness that are likely to enhance the person’s resistance to
persuasion. One of these is that suspiciousness of impending attack should
motivate the person to absorb and generate defensive arguments for
his own position. A second such factor is that he would, having been
warned of an impending attack, be more likely to rehearse actively
his defense. A third consideration is that a forewarning would constitute
something of a challenge to his self-esteem to demonstrate his ability
to stand up for his own beliefs.

Fourth, fifth, sixth, and seventh considerations suggest that
suspiciousness may well have the opposite effect of enhancing the
persuasiveness of the message when it comes. Assuming that the subject
was responding to some kind of perceived demand to go along with
whatever the experiment entails, making him aware of its persuasive
intent would tend to increase the amount of opinion change he would
show. If he was trying to ingratiate himself for some reason with the
source, any awareness of the persuasive intent should have a similar
enhancing effect. Also, since the main obstacle to persuasive effect is
often the subject’s failure to perceive accurately the point of the message,
being made aware of its purpose should enhance its impact on him.
Finally, the awareness of persuasive intent brings home to the person
that there exists the source, often a person of some status, who holds
a view opposite to his own, and this would generate conformity pressures
even prior to the communication.

Two other mechanisms which may be involved and whose operation
is more ambiguous are set and distraction. Either one of these could
be produced by suspiciousness of persuasive intent, and either one could
operate by enhancing or diminishing the persuasive impact of the
message. In the sections that follow we shall consider each of these
nine possible associated mechanisms in turn.

1. Suspiciousness as motivating preparatory defense. McGuire and
Papageorgis (1961, 1962) postulated that people tend to underestimate
the vulnerability of their beliefs (at least those which they perceive
as cultural truisms) and are little motivated spontaneously to develop
a defense or even to absorb effectively the bolstering arguments that
are presented to them. In a series of studies (McGuire, 1964) it has
been demonstrated, in accord with this motivational deficit notion, that
the prior presentation of various kinds of threats to the belief is
efficacious in making beliefs more resistant to subsequent strong attacks.
However, the type of threat most relevant to the present discussion,
forewarning that the forthcoming communication will constitute a
persuasive attack on the given belief, is efficacious in enhancing resistance
only if presented in conjunction with belief-bolstering material,
indicating that both motivation and help in developing a defense must be
supplied. The belief-bolstering material plus the suspicion-arousing
threat was more efficacious than the belief-bolstering material alone
(McGuire and Papageorgis, 1962).

2. Defense rehearsal consequent on forewarning. Another possible
source of resistance to persuasion occasioned by a suspicion of impending
attack is that such a forewarning increases the likelihood that the
believer will rehearse his belief defenses and thus be better prepared
to refute the suspected attack when it comes. That a rehearsal
opportunity is important in the resistance-conferring effect of promoting
suspiciousness of impending attack is suggested by the studies varying
the temporal interval between the threat and the actual arrival of the
attack. Freedman and Sears (1965) have shown that a forewarning of
impending attack is more efficacious if it comes ten rather than two
minutes prior to the attack. McGuire (1962, 1964) has demonstrated
that prior mention of weakened attacking arguments or the requirement
of active participation in defending one’s beliefs has an accumulative
effect over time, for a period of several days at least, in conferring
resistance. There is some suggestion that the rehearsal factor which
would produce a delayed-reaction resistance effect in the case of active
participation (McGuire, 1964) occurs also as regards the persistence
of opinion change (Watts, 1967).

3. Suspiciousness as enhancing one’s personal commitment to one’s
opinion. It was argued by McGuire and Millman (1965) that making
the believer suspicious that a forthcoming communication constitutes
an attack on his belief tends to engage his self-esteem more explicitly
in his response to the communication. The notion here is that people
tend to behave so as to maintain their self-esteem and that in our society
there are many situations in which yielding to a persuasive communica-
tion would be damaging to one’s self-regard, for example when the
issue is a matter of taste, or when the source is disreputable, or when
one is clearly committed publicly (or at least in one’s own mind) to
one’s initial position. Since suspiciousness that the communication is
designed to persuade puts one’s self-esteem on the line, it might be
predicted that the person who is made suspicious will be resistant to the
attack when it comes. Actually, the McGuire-Millman (1965) study
was designed to test a hypothesis about a different mode of coping
with self-esteem needs in the face of an impending persuasive attack.
They predicted (and found) that forewarned subjects actually lowered
their beliefs on matters of taste on which they had not explicitly
committed themselves, in advance of a suspected attack. In this case the
forewarning actually weakened the belief, our interpretation being that
the believer spontaneously moves his belief in the direction of the
impending attack so that he can tell himself afterward that he felt the
same way all the time, rather than that he was influenced by the
persuasive message. It should be noted, however, that while under the
conditions of the McGuire-Millman study suspicion of an impending attack
weakened the belief, the situation could have been designed so that
self-esteem considerations would have produced greater resistance to the
attack.

4. Suspiciousness and message perception. We have been stressing
here and elsewhere that in most persuasion situations, in the laboratory
and in the natural environment as well, we do not confront an audience
attentively alert and resolute to resist our arguments, a notion that seems
to be the point of departure for more than a little theorizing about
persuasibility. Rather, the audience tends to be rather apathetic, with
little felt need to resist such arguments as get to them but not much
inclined to pay attention to the message either. According to our analysis
of the situation, the ineffectiveness of persuasive communication more
often derives from poor message reception than from unyieldingness
to such part of it as is received by the audience.

Insofar as this conceptualization has general validity, awareness of
the persuasive intent of the message would actually augment its
opinion-change impact. Defining prior to message reception just what the
communication is designed to achieve in the way of attitude change could
be looked upon as an introductory summary that facilitates message
reception (Hovland, Lumsdaine, and Sheffield, 1949). Theorists are becoming
increasingly aware that persuasive communication situations are looked
upon by the audience more as a problem to be solved than as an intrusion
on their autonomy to be resisted (Bauer, 1966).

5. Suspiciousness as clarifying demand character. Since Orne in
Chapter 5 of this volume discusses the role of “demand character”
in determining the outcome of psychological experiments, we need
discuss this matter only briefly here, as it bears on the suspiciousness of
persuasive intent issue. The usual psychological subject is a fairly
cooperative individual. Sometimes he comes to our laboratory voluntarily
(giving rise to problems that are considered more fully by Rosenthal
and Rosnow in Chapter 3 of this volume), but even when he comes
simply to earn a fee or to fill a course requirement, he tends to enter
the experiment in a fairly compliant mood. We would venture
a guess that perhaps nine out of every ten subjects in psychological
experiments would prefer to help rather than hinder the experimenter.
The experimenter, being associated with the university faculty, tends to
be a fairly benevolent and prestigeful figure to the college students who
constitute the majority of our subjects. The student population, perhaps
even more than the general population at large, is made up of reasonable,
well-disposed individuals who value research and are disposed to help
the experimenter according to their lights in the conduct of his
experimentation.

Hence any indication in the experimental situation which arouses the
subject’s suspiciousness of the persuasive intent of the communication
would tend to enhance the amount of opinion change produced. The
cooperative subject is likely to assume that if the experimenter presents
him with a persuasive message he intends that the audience be
persuaded by it. The effect of such enhanced compliance on the part of
suspicious subjects responding to what they perceive as the demand
character of the experiment would be a main effect, such that opinion
change would be enhanced across most experimental conditions.
Occasionally, we might use experimental conditions such that the
suspiciousness would lead the subject to cooperate in some way other than
increased compliance. Such situations are more worrisome since then the
suspiciousness would tend to interact with our main independent variable
rather than simply to add a constant to the persuasive impact
across conditions. An earlier distinction which we made regarding what
the subject is suspicious of is relevant here. If the subject is merely
suspicious that the intent of the communication is to persuade him,
the result should be simply to add a constant to the amount of change
produced. If, however, he is suspicious that a certain hypothesis is being
tested, the effect is more worrisome, since he might be complying with
what he feels is demanded of him in a way that would make the results
difficult to generalize to the population at large, which is not responding
to any such sophisticated demand character.
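
The difference between suspiciousness adding a constant and
suspiciousness interacting with the independent variable can be made
concrete with the cell means of a two-by-two design. The sketch below is
illustrative only: the cell means are invented, the condition labels are
hypothetical, and the contrasts are the ordinary main-effect and
interaction contrasts rather than a procedure taken from the studies
discussed.

```python
def contrasts(cells):
    """Main effect of suspicion and its interaction with the independent variable.

    cells maps (iv_level, suspicion_state) to mean opinion change, with
    iv_level in {"low", "high"} and suspicion_state in {"naive", "suspicious"}.
    These labels are hypothetical.
    """
    naive_effect = cells[("high", "naive")] - cells[("low", "naive")]
    suspicious_effect = cells[("high", "suspicious")] - cells[("low", "suspicious")]
    main_effect = (
        (cells[("low", "suspicious")] + cells[("high", "suspicious")]) / 2
        - (cells[("low", "naive")] + cells[("high", "naive")]) / 2
    )
    # Zero when suspicion shifts every condition by the same constant.
    interaction = suspicious_effect - naive_effect
    return main_effect, interaction


# Invented means: suspicion of persuasive intent adds 0.5 in every condition.
ADDITIVE = {
    ("low", "naive"): 1.0, ("high", "naive"): 2.0,
    ("low", "suspicious"): 1.5, ("high", "suspicious"): 2.5,
}
# Invented means: subjects who guess the hypothesis exaggerate the predicted difference.
HYPOTHESIS_AWARE = {
    ("low", "naive"): 1.0, ("high", "naive"): 2.0,
    ("low", "suspicious"): 0.5, ("high", "suspicious"): 3.0,
}

if __name__ == "__main__":
    print("additive:", contrasts(ADDITIVE))
    print("hypothesis-aware:", contrasts(HYPOTHESIS_AWARE))
```

In the first set of means the interaction contrast is zero, so the effect
of the independent variable is the same for suspicious and unsuspicious
subjects. In the second set it is not, which is the case the text
describes as the more worrisome one.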

6. Suspiciousness and source attractiveness and power. Many theorists
have pointed out that in the laboratory and in the natural environment
people behave in accord with an “exchange” theory such that,
if one person conforms to the other’s persuasive communication, the
other incurs an obligation to do a reciprocal favor for the first person.
An implication of this exchange theory in the present context is that
increasing the subject’s suspiciousness that the communication
is designed to persuade him will, under specifiable conditions of source
valence, increase rather than diminish its persuasive impact. Two relevant
lines of current work come to mind in this connection. The ingratiation
work by Jones (1964) would indicate that when an inferior is confronted
by the demands of a more powerful source (a set of conditions
frequently operative in the laboratory as well as in natural persuasion
situations) he can by judicious compliance on selected issues build up
“credit” with the power figure which could serve him well later. Hence,
where the subject is inclined to use conformity as an ingratiation tactic,
arousing his suspiciousness of persuasive intent will only increase his
attitude change.

While history may have demonstrated that for controlling the minds
and behavior of man it is better to be feared than loved, love also
finds the way to the hearts and minds of men. Mills and Aronson (1965)
have demonstrated that a communicator who makes clear his desire
to influence the subject’s opinion is more persuasive than one who does
not so arouse suspiciousness of persuasive intent, but only when this
source is attractive. Where the communicator was unattractive,
suspiciousness had little effect on the amount of change produced. In a
subsequent study Mills (1967) finds that suspiciousness of persuasive
intent enhances the opinion change impact with an attractive source
and diminishes it with an unattractive source. The psychodynamics
involved here indicate again that it is naive to assume that suspiciousness
will routinely result in diminished effectiveness.

7. Suspiciousness and the communication of consensus. McGuire and
Millman (1965) explained the anticipatory belief-lowering effect of an
announcement of an impending attack on one’s belief as due to a
self-esteem-preserving tactic. Specifically, one moved one’s belief in the
direction of the suspected influence prior to the communication so that
one would not have to admit to having been influenced by it. This
anticipatory belief lowering following the announcement of an impending
persuasive attack has been replicated in other laboratories. However,
Papageorgis (1967) has demonstrated that this self-esteem explanation
may be superfluous. He has demonstrated that the “anticipatory” belief
lowering occurs after simply announcing that the other person holds
the divergent belief, even when there is no implication that he is about
to present the subject with a persuasive communication. Whether there
is an additional impact via the self-esteem mechanism when the subject
is also told that this other person is about to present him with a
persuasive communication remains to be tested. Some suggestion that the
self-esteem explanation may also be responsible for the effect is given
by the interaction with type of issue which was obtained in McGuire
and Millman (1965).

8. Suspicion as establishing einstellung. Arousing the person’s
suspicion of the persuasive nature of an impending communication should
induce in him a preparatory set that would influence the way in which
he perceives the message when it comes and hence its impact on his
belief system. Elsewhere (McGuire, 1966), we have considered the
evidence for and against this einstellung hypothesis that has been
contributed by research on the primacy-recency issue in attitude change
research. The implication of this formulation in the present instance
is rather ambiguous. Given that suspiciousness of persuasive intent
establishes an expectation as to what the content of the message will be,
it is hard to predict whether this preparatory set will result in
assimilation or contrast in the person’s perception of the content when it
actually comes. The Sherif-Hovland (Sherif and Hovland, 1961; Sherif,
Sherif, and Nebergall, 1965) formulation would suggest that assimilation
tends to occur (along with increased opinion change) when the message is
close to the subject’s own position, while when the message is more
discrepant, the contrast effect (and lessened opinion change) results. The
appropriate prediction is even more difficult to make in the present
case since we are dealing not with the subject’s own position as the
reference point, but with his suspicion-aroused opinion of where the
message will be. There is some weak evidence (Ewing, 1942) that a
subject who suspects that he is about to hear a quite discrepant
communication would tend to perceive the given message as more discrepant
from his own position than it actually was. On the other hand, there
seems to be an overall tendency in human perception to distort
information toward, rather than away from, one’s own position, as a secular
trend which is imposed across the operation of other distortion tendencies.

9. Suspiciousness and distraction. Allyn and Festinger (1961)
manipulated suspiciousness of persuasive intent by disguising the
communication as a test of the subject’s ability to judge the speaker’s
personality in one condition, while in the other condition its persuasive
intent was revealed. Subsequently, Festinger and Maccoby (1964) argued that
the crucial factor here was not the suspiciousness aroused by the
revelation of persuasive intent but rather the distraction produced by the
personality judgment task. (Since the original effect was quite slight by
conventional statistical standards, a wise commentator has aptly written of
it with amused patience that “seldom has so slight an effect been made
to bear so heavy a burden of explanation.”) In the later study, Festinger
and Maccoby (1964) report that when their audience was distracted
from the persuasive sound track by an irrelevant amusing film, it showed
more attitude change than when they watched a film appropriate to
the sound track. Freedman and Sears (1965), however, find little
evidence for the distraction effect over and above the effect of warning.
McGuire (1968) has conjectured that such effect of the film as may
have obtained in the Festinger and Maccoby study was perhaps
produced by the pleasant hedonic feeling resulting from its entertaining
nature, rather than by the distraction’s lowering of the audience’s
defenses, as Festinger and Maccoby conjectured. That a given message
is more persuasive if the audience hears it in a pleasant mood has been
demonstrated by Janis, Kaye, and Kirschner (1965) and by Dabbs and
Janis (1965).

McGuire (1966) has argued further that it would be surprising if
distraction did indeed enhance the persuasive impact of a message. The
prediction of such an enhancement rests on the notion that the audience
tends to be waiting and ready to defend themselves against the
onslaught unless they are distracted so that it reaches them with their
defenses down. As we have mentioned at several points in this chapter
and elsewhere, we entertain the contrary notion that audiences are
typically apathetic, disinclined to attend to the message sufficiently for
it to affect them but disinclined also to resist such of its arguments as
reach them. Since what is in shortest supply is motivation and ability
to comprehend the message content sufficiently to be affected by it,
it seems to us that the distraction would reduce rather than enhance
its persuasive impact. Perhaps some resolution of this difference of
opinion is found in the work by Rosenblatt (1966; Rosenblatt and …,
1966) which suggests a nonmonotonic relationship between distraction
and persuasive effectiveness, with maximum impact occurring with a
moderate amount of distraction. He also finds some confounding between
distraction and the subject’s suspiciousness. We anticipate that the
current movement in psychology towards the concept of the organism as
an information-processing machine will probably influence
research for some time to come. Indeed, we would venture the prediction
that this information-processing theme will cause more attention to be
paid to the reception mediator, as opposed to the yielding mediator, in
determining the relationship of such independent variables as
suspiciousness to opinion change.



SUSPiaOUSNESS OF EXPERIMENTERS INTENT 


47 


C. Temporal Considerations Regarding Suspiciousness 
The relationship of suspiciousness of persuasive intent to amount of
opinion change seems highly dependent on time parameters. We shall
review the results of research on the effects of varying the interval
between the warning and the attack and also the interval between the
attack and the measurement of opinion change effect.

1. The warning-attack interval. A number of studies have indicated
that the warning is effective only if it precedes the actual attack. Thus,
McGuire (1964) shows that a forewarning of an impending attack
increases the immunizing efficacy of a prior defense if it is presented
before the defense, but that it has little or no efficacy if the warning
comes after the defense. Kiesler and Kiesler (1964) report that an
attribution designed to arouse suspiciousness of persuasive intent is
effective in reducing the impact of the message if it is presented at the
beginning but not if it comes at the end of that message. Greenberg
and Miller (1966) find in three replications that if a source is identified
as a person of low credibility prior to the presentation of the message,
the message has less persuasiveness than when the source is not
identified, but there is no retroactive effect such that the low credibility
attribution diminishes the impact when it occurs only after the message
has been presented.

There is further evidence that the warning must not only precede
rather than follow the message, but that it should precede the message
by some finite time period in order to exhibit its maximum effectiveness.
Thus, McGuire (1964) has shown that the resistance conferral produced
by having the believer participate in a worrisome active defense shows
up more fully against an attack which comes a week later than against
an attack which follows the defense immediately. He has also shown
(McGuire, 1962) that a "refutational" defense which mentions some
threatening counterarguments develops its immunizing efficacy
increasingly for several days subsequent to its initial presentation. Hence,
resistance is greater to an attack that follows this threatening defense by
two days than to one which follows it immediately. Freedman and Sears
(1965) found that a warning was more efficacious in reducing opinion
change if it preceded the attacking message by ten, rather than two,
minutes.
2. Interval between attack and measurement of effect. Both the
temporal effect just discussed and the one to which we turn here assume
an inertia in the cognitive apparatus, such that effects produced by
experimental intervention or by persuasive communications in the
natural environment manifest themselves only gradually over time.
Hence, an immediate post-manipulation measure might indicate effects
different from those revealed by delayed measures.

Above, we conjectured that suspicion of persuasive intent does produce
strain in the person but the effect of this strain becomes [...]
as the person has sufficient time and ingenuity to act on the [...]
motivation. The same gradualism considerations have a reciprocal
effect when we consider the dampening effect of suspiciousness on
opinion change when it is allowed time to be operative. We have
in mind that the suspiciousness at the time of the message presentation
constitutes the "discounting cue" in the Hovland [...] formulation
(Hovland and Weiss, 1951; Kelman and Hovland, 1953). If this
analysis is correct, we would expect that suspiciousness of persuasive
intent would reduce the immediate opinion change impact but, as time
passes, allowing the association between the discounting suspiciousness
cue and the convincing message content to weaken, the [...]
of the persuasive message would begin to manifest itself. These
delayed action effects in persuasion have been more fully discussed
elsewhere (McGuire, 1968).

D. Individual Differences in Suspiciousness

It often happens in the history of the psychological research on any
issue that after it has been studied as an across-subject variable for
a certain period, attention is turned to individual differences in its
manifestation. First, individual differences are investigated as they
moderate the effect of the variable, and then the investigation begins to
focus on the interaction between the variable and the personality or other
individual difference characteristics. Since we started our discussion in
this chapter with an overview of the career of an artifact through the
stages of ignorance, coping, and exploitation, it is only appropriate that
we conclude our discussion of the substantive issues raised by the
suspiciousness of persuasive intent variable with a discussion of individual
differences in the operation of this factor.

It seems rather evident that people will vary as regards the extent
to which our manipulation arouses their suspiciousness. These individual
differences involve both ability and motivational variables. For example,
quite early in the research on overt versus covert conclusion drawing
in the message (which we considered above), attention was turned to
the possible role of audience intelligence in moderating any effect
(Cooper and Dinerman, 1951; Hovland and Mandell, 1952;
Thistlethwaite, deHaan, and Kamenetzky, 1955), with rather weak evidence
for any such ability interaction. More positive evidence for
the suspiciousness notion was given by the World War II findings
(Hovland, Lumsdaine, and Sheffield, 1949) that a message which mentioned
the opposition arguments was more effective than one which ignored
them if we consider the more intelligent army personnel rather than
the less intelligent.

Besides the question of individual differences in responsiveness to
a deliberate manipulation of suspiciousness, there is the question of
idiosyncrasies in the spontaneous arousal of suspiciousness in experimental
situations, a topic that has begun to receive attention from Stricker
and his colleagues (Stricker, 1967; Stricker, Messick, and Jackson, 1966).
They find considerable situational variance in the amount of
suspiciousness aroused but an appreciable degree of across-situational
generality of suspiciousness for individuals. In their studies males show
more suspiciousness than females, and they also report some tendency
for males to show a positive relationship between need for approval and
suspiciousness. This would seem to reverse the finding by Rosenthal,
Kohn, Greenfield, and Carota (1966) that subjects scoring high on need
for social approval show less awareness of the response-reinforcement
contingencies in verbal conditioning situations. However, the resolution
may reside in a distinction between suspiciousness and willingness to
report suspiciousness. With the focusing of interest on individual
difference characteristics as they interact with suspiciousness-arousing
manipulations, the latter variable has achieved full status as a respectable
psychological issue in its own right, rather than as an artifact to be
overcome. The next step should be the demonstration that the effects
produced by this erstwhile artifact should themselves be attributed to an
artifact yet to be discovered.


V. DECEPTION AND SUSPICIOUSNESS: THE
ETHICAL DIMENSION

Without experimenter deception, the issue of suspiciousness would
never arise. Hence, the methodological and theoretical problems raised
by the suspiciousness issue imply that there is already an ethical
problem. Kelman (1963, 1967) particularly has called attention to the
ethical problems involved specifically in experimenters' use of deception.
The present time seems to be one of rising ethical anxieties among many
involved in behavioral science research and, perhaps even more, in lay
observers of this research. Some would feel that, as unsavory as the
use of deception is, there is even greater cause for moral concern in
other practices in behavioral science research, such as invasion of privacy,
harmful manipulation, etc. Paradoxically, deception is sometimes
employed to circumvent these more serious concerns, as when Milgram
(1963, 1965) deceived the subject into thinking that he was hurting
another person rather than allowing him actually to do so.

The argument that worse things are done in behavioral science than
deceiving subjects offers cold comfort to the researcher suffering moral
qualms. Hence, we shall confine ourselves in the present chapter to
discussing deception apart from other ethical issues, since it is the one
intrinsically involved in the question of suspiciousness of persuasive
intent. It seems undeniable that there is some moral cost in the use
of deception in experimental situations. Perhaps few can feel distaste
for willful deception more than we scientists, for whom the discovery
of truth constitutes the basic moral imperative which lies at the core
of our vocation. Most of us feel at least a slight moral revulsion,
aesthetic strain, and embarrassment when deceiving a subject in order to
create an experimental situation, even when we feel that our deception
is in the service of the discovery of a higher and more lasting truth. Even
our more crass fellow researchers, who sometimes act as if they enjoy
and relish every experimental deception they ever practiced, do show
a sign of healthy moral unease in their sharing our compulsion to remove
the deception by a suitable debriefing or catharsis explanation at the
end of the experiment. The almost universal use of such a
postexperimental revelation of the deceptions is particularly impressive
evidence of the felt ethical concern, since such a revelation introduces
another source of artifact about which researchers worry, namely, the
communication of the true purpose of the experiment by earlier
participants to later ones (Zemach and Rokeach, 1966). The data therefore
become contaminated with the suspiciousness artifact which our deception
was used to avoid. Some bases for feeling this moral unease over deceiving
one's fellow man, even in an experimental situation in the service of
truth, have been discussed more fully by Kelman (1965, 1967). We
shall simply state here our opinion that the experimenter who denies
he feels any moral qualms about the use of deception in experiments
is deceiving himself.

While we emphatically insist that the use of deception does involve
a moral cost, we equally emphatically insist that it might be necessary
to pay this cost and continue to use deception rather than to cease
our research. We must first admit here that our notion of ethics involves
quantitative considerations, a stand which some of our more absolutistic
fellow intellectuals might regard as vulgar. We deny that a practice
which involves a moral cost must ipso facto be avoided. Admittedly with
more resignation than enthusiasm, we are willing to employ a cost-utility
analysis in the ethical evaluation of our behavioral alternatives, as our
economist colleagues are applying it to the problems of administrative
decision making and program budgeting. We are willing to admit that
we are arguing that the signal for stopping a practice is not the discovery
that it has a moral cost, but that it has a greater moral cost relative
to its moral utility than have other available courses of behavior.
It seems to us that the alternatives to using deception in our
experiments are to find some way of pursuing the line of research without
the use of deception or to give up the line of research. Let us consider
each of these in turn. It has been argued that certain research cannot
be done unless the subject is deceived as regards the experimental
purpose. In an earlier section of this chapter we exhibited an undisguised
description of such an experiment to indicate the patent absurdity of
taking the subject fully into the experimenter's confidence and expecting
to find generalizable results. Some might argue that we can disguise
the intent of our experiment without the use of deception. For example,
we might provide no information to the subject regarding the true
purpose of the experiment. The use of active deception is not simply hiding
the true intent but providing false information so as to mislead the
subject into suspecting another purpose. This practice probably
originated in the realization that many subjects will find it psychologically
necessary to generate some explanation for what is involved in the
experiment in which they are participating. If they are not deceived by
a plausible alternative explanation provided by the experimenter, they
will derive their own explanation (which may be correct or incorrect
as regards the actual purpose, but in either case might equally
contaminate the results in unknown ways and reduce their generalizability).
Even if the experimenter not only withholds explanation but explicitly
requires the subjects not to try to figure out what the experiment is
about, many, even with the best of good will, might be unable to restrain
their conjecturing about its purpose. Hence, simply not revealing to
the subject the purpose of the experiment will perhaps not be as effective
in eliciting generalizable results as will actively deceiving him by
presenting him with a false purpose. Furthermore, leaving the subject in
ignorance or allowing him to deceive himself as to the purpose of the
experiment might be felt by some moralists to itself present an ethical
problem and to skirt perilously on the fringe of active deception.
A diametrically opposed method of carrying on the research without
the use of deception is to be blatantly outspoken about the nature of
the experiment, as regards various levels discussed in a previous section,
and to enlist the subject as an active collaborator in the investigation.
Such a procedure usually involves some form of role playing, such that
the subject is in effect told what the experiment is about and then
asked to adopt the role of subject by behaving as he feels a subject
who is actually in the situation would probably behave. The role-playing
procedure has been used to good effect by Rosenberg and Abelson
(1960) and Kelman (1967), and the role playing is not necessarily linked
to a full disclosure of the experimental purpose. An interesting variant
is the observer procedure used in Bem's (1967) radical behavioral
technique.

In a role-playing procedure, instead of deceiving the subject into
thinking that he is lying to a fellow student in return for a one dollar
or a 20 dollar bribe and then testing him to see how much he believes
his own lie, the subject can be asked to imagine that he is telling the
lie for one dollar or for 20 dollars and then asked to indicate how
much he would probably believe the lie if he had actually told it under
the several conditions. This role-playing procedure is particularly
attractive in that it avoids not only deception but some of the other
ethically troublesome procedures, such as involving the subject in
psychologically harmful acts. Despite this moral attractiveness of the
role-playing procedure, and even though some of the recent studies in this
line of work indicate that similar results are obtained from the role-players
and "observers" as from actual subjects, we have little confidence that
this role-playing procedure will constitute a final solution that will
eliminate the deception problem from the psychologist's list of woes. We
feel intuitively that over the wide range of psychological problems, this
"public opinion polling" approach of having the quasi-subjects tell us how
the experiment would probably come out had we done it will prove
quite limited. Still, until its limits are explored it seems a feasible line
of research to pursue. We expect that the success of Kelman and Bem
in their lines of research will encourage other investigators to take up
the exploration.

Should all attempts to circumvent the deception problem and still
continue the research fail, there remains the alternative of ceasing the
research altogether. While we feel that considerable effort is worthwhile
in order to carry on our research without deception, we ourselves value
our research sufficiently so that, rather than give it up altogether, we
would think it worthwhile to pay the moral cost of deceiving subjects
as to the nature of the experiment, provided we explain to the subject
at the end of the experiment the various deceptions to which he was
exposed and our reasons for utilizing them. In general, we listen with
little enthusiasm to the argument that research that cannot be done
without deception should be given up altogether. The alternative of
giving up a line of research is one that too many of our colleagues
have found too easy to take for us to entertain such behavior on the
part of our still-working colleagues with any enthusiasm.

For psychologists who are actually engaged in research, we regard
the solution of ceasing work the least attractive of the alternatives open
to them. On the contrary, we feel that the most besetting moral evil
in the psychological community today is indolence. Were we to list
the moral problems of psychology, we would cite those who are doing
experiments which involve deception far below those who are doing
too few experiments or none at all as a source of ethical concern. The
besetting offense that we find in the psychological profession, as in so
many other sectors of the middle class, is not malfeasance but
nonfeasance. It seems to us that the angel of death is likely to come upon
more of our colleagues in idleness than in sin. We hope that methodological
and moral concern over the problem of deception and subjects'
suspiciousness will not be used to add to the ranks of the
self-unemployed.


There is a long-standing fear among behavioral researchers that those
human subjects who find their way into the role of "research subject"
may not be entirely representative of humans in general. McNemar
(1946, p. 333) put it wisely when he said, "The existing science of human
behavior is largely the science of the behavior of sophomores."
Sophomores are convenient subjects for study, and some sophomores
are more convenient than others. Sophomores enrolled in psychology
courses, for example, get more than their fair share of opportunities
to play the role of the research subjects whose responses provide the
basis for formulations of the principles of human behavior. There are
now indications that these "psychology sophomores" are not entirely
representative of even sophomores in general (Hilgard, 1967), a
possibility that makes McNemar's formulation sound unduly optimistic. The
existing science of human behavior may be largely the science of those
sophomores who both (a) enroll in psychology courses and (b) volunteer
to participate in behavioral research. The extent to which a
comprehensive science of human behavior can be based upon the
behavior of such self-selected and investigator-selected subjects is an
empirical question of considerable importance. It is a question that has
received increasing attention in the last few years (e.g., London and
Rosenhan, 1964; Ora, 1965; Rosenhan, 1967; Rosenthal, 1965).

¹ Preparation of this chapter, which is an extensive revision of an earlier paper
published in Human Relations (Rosenthal, 1965), was facilitated by research grants
GS-714, GS-1741, and GS-1733 from the Division of Social Sciences of the National
Science Foundation. We want to thank our many colleagues who helped us by
sending us unpublished papers, unpublished data, and additional information of
various kinds. These colleagues include Timothy Brock, Carl Edwards, John R. P.
French, Donald Hayes, E. R. Hilgard, Thomas Hood, Gene Levitt, Perry London,
Roberta Marmer, A. H. Maslow, Ray Mulry, Lucille Nahemow, John Ora, Jr., David
Poor, David Rosenhan, Dan Schubert, Duane Schultz, Peter Suedfeld, Jay Tooley,
Allan Wicker, Abraham Wolf, and Marvin Zuckerman.

The problem of the volunteer subject has been of interest to many
behavioral researchers, and evidence of their interest will be found in
the pages to follow. Mathematical statisticians, those good consultants
to behavioral researchers, have also interested themselves in the
volunteer problem (e.g., Cochran, Mosteller, and Tukey, 1953). Because of
their concern we now know a good deal about the implications for
statistical procedures and statistical inference of having drawn a sample
of volunteers (Bell, 1961). The concern with the volunteer problem
has had for its goal the reduction of the nonrepresentativeness of
volunteer samples so that investigators may increase the generality of their
research results (e.g., Hyman and Sheatsley, 1954; Locke, 1954).² The
magnitude of the problem is not trivial. The potential biasing effect
of using volunteer samples has been clearly illustrated recently. At one
large university, rates of volunteering varied from 10 per cent to [...]
per cent. Even within the same course, different recruiters visiting
different sections of the course obtained rates of volunteering varying from
50 per cent to 100 per cent (French, 1963). At another university, rates
of volunteering varied from 26 per cent to 74 per cent when the same
recruiter, extending the same invitation to participate in the same
experiment, solicited female volunteers from different floors of the same
dormitory (Marmer, 1967).

Some reduction of the volunteer sampling bias may be expected from
the fairly common practice of requiring psychology undergraduates to
spend a certain number of hours serving as research subjects. Such a
requirement gets more students into the overall sampling urn, but
without making their participation in any given experiment a randomly
determined event. Students required to serve as research subjects often
have a choice among alternative experiments. Given such a choice, will
brighter (or duller) students sign up for an experiment on learning?
Will better (or more poorly) adjusted students sign up for an experiment
on personality? Will students who view their consciousness as broader
(or narrower) sign up for an experiment that promises an encounter
with "psychedelicacies"? We do not know the answers to these questions
very well, nor do we know whether these possible self-selection biases
would make any difference in the inferences we want to draw.

² Most of the interest has been centered on the selection of human subjects,
which is our concern here, but there are similar problems of the selection and
representativeness of those animal subjects that find their way into behavioral
research (e.g., Beach, 1950, 1960; Christie, 1951; Kavanau, 1964, 1967; Richter,
1959).

If the volunteer problem has been of interest and concern in the
past, there is good evidence to suggest that it will become of even greater
interest and concern in the future. That evidence comes from the popular
press and the technical literature, and it says to us: In the future you,
as an investigator, may have less control than ever before over the kinds
of human subjects who find their way into your research. The ethical
questions of humans' rights to privacy and to informed consent are
more salient now than ever before (Bean, 1959; Clark et al., 1967;
Miller, 1966; Orlans, 1967; Rokeach, 1966; Ruebhausen and Brim, 1966;
Wicker, 1968; Wolfensberger, 1967; Wolfle, 1960). One possible outcome
of this unprecedented soul-searching is that the social science of the
future may, due to internally and perhaps externally imposed constraints,
be based upon propositions whose tenability will come only from
volunteer subjects who have been made fully aware of the responses of
interest to the investigator. However, even without this extreme
consequence of the ethical crisis of the social sciences, we still will want to
learn as much as we can about the external circumstances and the internal
characteristics that bring any given individual into our sample of subjects
or keep him out.

Our purpose in this chapter will be to say something of what is known
about the act of volunteering and about the characteristics that may
differentiate volunteers for behavioral research from nonvolunteers.
Subsequently we shall consider the implications of what we think we know
for the representativeness of the findings of behavioral research and for
the possible effects on the results of experiments employing human
subjects.


I. THE ACT OF VOLUNTEERING 

Finding one's way into the role of the subject is not a random event.
The act of volunteering seems to be as reliable a response as the response
to many widely used tests of personality. Martin and Marcuse (1958),
employing several experimental situations, found reliabilities of the act
of volunteering to range from .67 for a study of attitudes toward sex
to .97 for a study of hypnosis. Such stability in the likelihood of
volunteering raises a question as to whether there may not also be stability
in the attributes associated with the likelihood of volunteering. Several
relatively stable attributes that show promise of serving as predictors
of volunteering will be discussed later in this chapter. [...]
we shall discuss the less stable, more situational determinants of
volunteering.

There is no contradiction [...] that situational determinants [...]
even in view of the reliability of the act of volunteering. [...]
of the reliability of volunteering, situational determinants tend to be
relatively constant from the initial request for volunteers to the
subsequent request, so that the role of situational determinants is [...]
diminished. Unavailable at present, but worth collecting, are data on
volunteering as a simultaneous function of personal characteristics of
volunteers and situational determinants of volunteering.

A. Incentives to Volunteer 

Not surprising is the fact that when potential subjects fear that they
may be physically hurt, they are less likely to volunteer. Subjects
threatened with electric shocks were less willing to volunteer for
studies involving the use of shock (Staples and Walters, 1961). More
surprising perhaps is the finding that an increase in the expectation
of pain does not lead concomitantly to much of an increase in avoidance
of participation. In one study, for example, 78 per cent of college
students volunteered to receive very weak electric shocks, while almost
that many (67 per cent) volunteered to receive moderate to strong
shocks (Howe, 1960). The difference between these volunteering rates
is of only borderline significance (p < .15). The motives to serve science
and to trust in the wisdom and authority of the experimenter ([...],
this volume), and to be favorably evaluated by the experimenter
(Rosenberg, this volume), must be strong indeed to have so many people
willing to tolerate so much for so little tangible reward. But perhaps in
Howe's (1960) experiment the situation was complicated by the fact that
there was more tangible reward than usual. The rates of volunteering
which he obtained may have been elevated by a $3.00 incentive that he
offered in return for participation. The subjects who volunteered for
electric shocks may also have been those for whom the $3.00 had more
[...] value. Volunteers showed a significantly greater (p = .001) "need
for cash" than did nonvolunteers. Need for cash, however, was determined
after the volunteering occurred, so it is possible that the incentive was
viewed as more important by those who had already committed
themselves to participate, by way of justifying their commitment to
themselves.
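
The comparison of the two volunteering rates reported for Howe (1960)
can be made concrete with a short computation. What follows is only a
minimal sketch and not Howe's own analysis: it applies a pooled
two-proportion z test to the rates of 78 and 67 per cent. The function
name and, above all, the group size of 75 subjects per condition are
assumptions made purely for illustration, since the group sizes are not
given in the text; the resulting p value changes as they change.

```python
import math


def two_proportion_z_test(p1, n1, p2, n2):
    """Two-sided pooled z test for the difference between two
    independent proportions p1 and p2 observed in groups of
    size n1 and n2. Returns (z, two_sided_p)."""
    # Pooled proportion under the null hypothesis of equal rates.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    # Standard error of the difference under the null hypothesis.
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Reported rates: 78 per cent (very weak shock) and
# 67 per cent (moderate to strong shock).
# ASSUMPTION: 75 subjects per condition (hypothetical, not from the text).
z, p = two_proportion_z_test(0.78, 75, 0.67, 75)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

Rerunning the sketch with other hypothetical group sizes shows how
strongly the significance level of a fixed difference in rates depends
on the number of subjects in each condition.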

As the intensity of the plea for participation increases, more subjects
are likely to agree to become involved. For an experiment in hypnosis,
adding either a lecture on hypnosis or a $35 incentive increased the
rate of volunteering among student nurses about equally (Levitt, Lubin,
and Zuckerman, 1962). On intuitive grounds one can speculate that
students should perceive $35 as more rewarding than a lecture, so
possibly the student nurses in this study were responding to the heightened
intensity of the request for volunteers. Perhaps the more important the
subject's participation seems to be to the recruiter, the higher
will be the rate of volunteering. That certainly seems to be the case among
respondents to a mail questionnaire who, though they did not increase
their participation when personalized salutations and true signatures
were employed by the investigator, markedly increased their
participation when special delivery letters were employed (Clausen and Ford,
1947). Consistent results were also obtained by Rosenbaum (1956), who
found that a great many more subjects were willing to volunteer for
an experiment on which a doctoral dissertation hung in the balance
than if a more desultory request was made.
Volunteering also seems to become more likely as it becomes the
proper, normative, expected thing to do. If other subjects are seen by
the potential volunteer as likely to consent, the probability increases
that the potential volunteer also will consent to participate (Bennett,
1955; Rosenbaum, 1956; Rosenbaum and Blake, 1955). And, once the
volunteer has consented, it may be that he would find it undesirable
to be denied an opportunity actually to perform the expected task.
Volunteers who were given the choice of performing a task (a) that
was more pleasant but less expected or (b) one that was less pleasant
but more expected, tended relatively more often to choose the latter
(Aronson, Carlsmith, and Darley, 1963).

Sometimes it is difficult to distinguish among appeals of increased
intensity, appeals that give the impression that volunteering is very much
the expected thing to do, and appeals that offer almost irresistible
inducements to participation. More subjects volunteer when they get to
miss a lecture as a reward, and a great many more volunteer when
they get to miss an examination (Blake, Berkowitz, Bellamy, and Mouton,
1956). Being excused from an exam seems to be such a strong
inducement that subjects tend to volunteer without exception even when it
means that they must raise their hands in class to do it. Under conditions
of less extreme incentive to volunteer, subjects seem to prefer less public
modes of registering their willingness (Blake et al., 1956) unless almost
everyone else in the group also seems willing to volunteer publicly
(Schachter and Hall, 1952). Bennett (1955), however, found no
relationship between volunteering and the public versus private modes of
registering willingness to participate.

Schachter and Hall (1952) have performed a double service for
students of the volunteer problem. They not only have [...] the
conditions under which volunteering is more likely to occur [...]
likelihoods that subjects recruited under various conditions [...]
show up for the experiment to which they have verbally [...]
their time. The results are not heartening. Apparently it is just [...]
conditions that increase the likelihood of a subject's volunteering [...]
increase the likelihood that he will not show up when he is supposed
to. This should serve to emphasize that it is not enough even to learn
who will volunteer and under what circumstances. We will also need
to learn which people show up, as our science is based [...] the
behavior of those who do. At least in the case of personality tests there
is evidence from Levitt, Lubin, and Brady (1962) to suggest that
"no-shows" (i.e., volunteers who never show up) are psychologically more
like nonvolunteers than they are like "shows" (i.e., volunteers who
show up as scheduled).


B. Subject Involvement

The proposition that subjects are more likely to volunteer the more
they are involved or the more they have to gain finds greater support
in the literature on survey research than in the literature on laboratory
experiments. Levitt, Lubin, and Zuckerman (1959), for example, found
no differences between volunteers and nonvolunteers for hypnosis
research in their attitudes towards hypnosis. Attitudes of the
nurse subjects were measured by responses to the "hypnotist picture"
of the TAT. In contrast, Zamansky and Brightbill (1965) found that
male undergraduate volunteers for hypnosis research rated the concept of
"hypnosis" more favorably (p = .05) than did nonvolunteers. These same
authors also found that subjects who were more susceptible to hypnotic
phenomena tended to rate the concept of "hypnosis" more favorably
(Brightbill and Zamansky, 1963; Zamansky and Brightbill, 1965).
Subjects for hypnosis research, therefore, may select themselves not only
for their view of hypnosis but also for their susceptibility to hypnosis.
Direct evidence for this possibility has been presented by Boucher and
Hilgard (1962).

It seems reasonable to speculate that college students majoring in
psychology would be more interested in behavioral research than would
nonpsychology majors. In an experiment on sensory deprivation, twice
as many psychology majors volunteered to participate as did
nonpsychology majors (Jackson and Pollard, 1966). Among the motives given
for volunteering, curiosity was listed by 50 per cent of the subjects,
financial incentive ($1.25 per hour) by 21 per cent, and being of help
to "Science" by a surprisingly low 7 per cent. The main reason for
not volunteering, given by 80 per cent of those who did not volunteer,
was that they had no time available.

In support of these results are those of Rosen (1951). His request
to undergraduates to take the Minnesota Multiphasic Personality
Inventory (MMPI) met with greater success among students who were more
favorably disposed toward psychology and behavioral research.
Similarly, Ora (1966) found his volunteers for psychological research to
be significantly more interested in psychology than were his
nonvolunteers. The greater interest and involvement of volunteers as compared
to nonvolunteers also is suggested in the work of Green (1963). He
found that when subjects were interrupted during their task
performance, nonvolunteers recalled fewer of the interrupted tasks than
did volunteers. Presumably the volunteers' greater involvement
facilitated their recall of the tasks that they were not able to complete.

It was noted earlier that it is in the literature on survey research
that one finds greatest support for the involvement-volunteering
relationship. Thus, the more interested a person is in radio and television
programming, the more likely he is to answer questions about his listening
and viewing habits (Belson, 1960; Suchman and McCandless, 1940).
When questions were asked in the 1930's about the use of radio in
the classroom, it was discovered that nonresponders tended to be those
who did not own radios (Stanton, 1939).

College graduates are about twice as likely to respond to a mail
questionnaire as college dropouts (Pace, 1939). Shuttleworth (1940) found
that those college graduates who responded more promptly to
questionnaires had an appreciably lower rate of unemployment (0.5%) than did
those who were slower to respond (5.8%). Similar results have been
reported by Franzen and Lazarsfeld (1945), Gaudet and Wilson (1940),
and Edgerton, Britt, and Norman (1947), all of whom conclude that
responders tend to be those individuals who are more interested in the
topic under study. A particularly striking example of this relationship
can be found in the research of Larson and Catton (1959).
Questionnaires were sent to 700 members of a national organization. Of those
responding to the first request, only 17 per cent of the respondents were
thoroughly inactive and presumably disinterested members. Of those
members who did not reply even after three requests, about 70 per
cent were inactive, presumably disinterested individuals.
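
The contrast in the Larson and Catton figures can be expressed as a
single number. The Python sketch below computes an odds ratio from the
two percentages quoted in the text. The odds ratio is an illustrative
summary added here, not a statistic reported in the source, and the
function names odds and odds_ratio are assumptions of this sketch.

```python
def odds(p):
    """Convert a proportion (0 < p < 1) to odds."""
    if not 0 < p < 1:
        raise ValueError("proportion must lie strictly between 0 and 1")
    return p / (1 - p)


def odds_ratio(p_a, p_b):
    """Odds of the event in group A relative to the odds in group B."""
    return odds(p_a) / odds(p_b)


# Proportions of inactive members quoted in the text
# (Larson and Catton, 1959). The 70 per cent figure is approximate.
inactive_among_never_replied = 0.70  # no reply even after three requests
inactive_among_first_wave = 0.17     # replied to the first request

ratio = odds_ratio(inactive_among_never_replied, inactive_among_first_wave)
print(f"Odds ratio: {ratio:.1f}")
```

On these figures the odds of being an inactive member are roughly eleven
times higher among those who never replied than among those who replied
to the first request.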

Sometimes it is not so much a matter of the general interest of the
individual as it is his specific attitude toward the issue under discussion
that determines whether he will be self-selected into the sample.
Matthysse (1966) wrote follow-up letters to research subjects who had
been exposed to pro-religious communications. He found that those
subjects who replied to his letter were more often those who
regarded religious questions as more important, i.e., attached greater
importance to the question of the existence of God. Siegman (1956), in
recruiting subjects for Kinsey-type interviews, found that 92 per cent
of those undergraduates who volunteered to be interviewed advocated
sexual freedom for women, while only 42 per cent of those who did
not volunteer advocated such freedom. Data obtained by Benson (1946)
suggested that when public policy is under discussion, respondents may
be overrepresented by individuals with strong feelings against the
proposed policy, a kind of political protest vote.

Survey literature is rich with suggestions for dealing with these
potential sources of bias. One practical suggestion offered by Clausen and
Ford (1947) follows directly from the work on involvement. It was
discovered that a higher rate of response was obtained if, instead of
one topic, a number of topics were surveyed in the same study. People
seem to be more willing to answer a lot of questions if at least some
of the questions are on a topic of interest to them. Another, more
standard technique is the follow-up letter or follow-up phone call that
reminds the subject to respond to the questionnaire. However, if the
follow-up is perceived by the subject as a bothersome intrusion, then,
if he responds at all, his response may reflect an intended or unintended
distortion of his actual beliefs. The person who has been reminded
several times to fill out the same questionnaire may not approach the
task in the same way he would if he were asked only once.

There is some evidence from Norman (1948) and from Wallin (1949)
which suggests that an increase in the potential respondent's degree
of acquaintanceship with the investigator may lead to an increase in
the likelihood of the individual's cooperation. Similarly, an increase in
the perceived status of the investigator may lead to an increase in the
rate of cooperation (Norman, 1948; Poor, 1967). Increases in the degree of
acquaintanceship with the investigator and in the investigator's status
may, therefore, reduce the volunteer bias, but there is a possibility that
one bias may simply be traded for other biases. Investigators who are
better acquainted with their subjects or who have a higher perceived
status may obtain data from their subjects that is different from data
obtained by investigators less well-known to their subjects or lower in
perceived status (Rosenthal, 1966). We may need to learn with which
biases we are more willing to live, which biases we are better able
to assess, and which biases we are better able to control.

C. The Phenomenology of Volunteering

Responding to a mail questionnaire is undoubtedly different from
volunteering for participation in a psychological experiment (Bell, 1961),
yet there are likely to be phenomenological similarities. In both cases
the prospective data-provider, be he "subject" or "respondent," is asked
to make a commitment of his time for the serious purposes of the
data-collector. In both cases, too, there may be an explicit request for candor,
and almost certainly there will be an implicit request for it. Perhaps
most important, in both cases the data-provider recognizes that his
participation will make the data-collector wiser about him without making
him wiser about the data-collector. Within the context of the
psychological experiment, Riecken (1962) has referred to this as the "one-sided
distribution of information." On the basis of this uneven distribution
of information the subject or respondent is likely to feel an uneven
distribution of legitimate negative evaluation.*

From the subject's point of view, the data-collector may judge him
to be maladjusted, stupid, unemployed, lower class, or in possession of
any one of a number of other negative characteristics. The possibility
of being judged as any of these might be sufficient to prevent someone
from volunteering for either surveys or experiments. The data-provider,
on the other hand, can, and often does, negatively evaluate the
data-collector. He can call the investigator, his task, or his questionnaire
inept, stupid, banal, and irrelevant, but hardly with any great feeling
of confidence as regards the accuracy of this evaluation. After all, the
data-collector has a plan for the use of his data, and the subject or
respondent usually does not know this plan, though he is aware that a
plan exists. He is, therefore, in a poor position to evaluate the
data-collector's performance, and he is likely to know it.

* For a full discussion of the importance to the subject of feeling evaluated
by the behavioral scientist who studies him, see Chapter 7 by Milton Rosenberg.

Riecken (1962) has postulated that one of the major aims of the
subject is to "put his best foot forward." It follows that in both survey
and experimental research, the volunteer subject may be the individual
who guesses that he will be evaluated favorably. Edgerton, Britt, and
Norman (1947) found that contest winners were more likely than losers
to respond helpfully to a follow-up questionnaire relevant to their
achievement. These same authors convincingly demonstrated the
consistency of their results by summarizing work which showed, for
example, that (a) parents of delinquent boys are more likely to respond
to questionnaires about the boys if the parents have nice things to say,
(b) college professors who hold minor and temporary appointments
are not so likely to reply usefully to job-related questionnaires, and
(c) patrons of commercial airlines are more prompt to return
questionnaires about airline usage than nonpatrons. Locke (1954) found married
respondents more willing than divorced respondents to be interviewed
about their marital adjustment. None of these findings deny the interest
hypothesis advanced by Edgerton, Britt, and Norman (1947). Indeed,
additional evidence, some of which was cited earlier, can be
interpreted as demonstrating that greater interest in a topic leads to
a higher response rate. Nevertheless, on the basis of Riecken's (1962)
analysis and in light of the empirical evidence cited here, we may
postulate that another major variable which contributes to the decision to
volunteer is the subjective probability of subsequently being favorably
evaluated by the investigator. It is trite but necessary to add that this
formulation requires more direct empirical test.


II. CHARACTERISTICS OF VOLUNTEERS 

We have discussed some of the less stable characteristics of the
volunteer subject that are specifically related to the source and nature of
the invitation to volunteer. Now let us consider more stable
characteristics of volunteers. We shall proceed attribute by attribute. In principle,
it would have been desirable to perform such an analysis separately
for each type of subject population investigated and for each type of
experiment or survey conducted. However, the variations of outcomes
of different studies of volunteer characteristics within even a given type
of subject sample and within even a given area of research were
sufficiently great that it seemed a prematurely precise strategy, given the
state of the data.

A. Sex 

The variations in the results of studies of volunteer characteristics
are well illustrated when the characteristic investigated is the subject's
sex. Belson (1960), Poor (1967), and Wallin (1949) reported no sex
differences associated with the rate of volunteering in their survey
research projects, nor did Hilgard, Weitzenhoffer, Landes, and Moore
(1961), Hood (1963), London (1961), and Schachter and Hall (1952)
in their experimental laboratory projects. However, for every study that
does not find a relationship between volunteering and sex of the
respondent, there is one or more that supports such a relationship. Table
I summarizes the results of 12 such studies. Eight of the studies
discovered that females volunteered more than males, while the remaining

                               TABLE I

             VOLUNTEERING RATES AMONG MALES AND FEMALES

                                                        Percentage
                                                        volunteering      Two-tail p
Author                        Task                      Females   Males   of difference

MORE VOLUNTEERING BY FEMALES

Himelstein (1956)             Psychology experiment     65%       43%     .02
Newman (1956)                 Perception experiment     60%       39%     .02
Newman (1956)                 Personality experiment    59%       45%     .25
Ora (1966)                    Psychology experiments    66%       54%     .001
Rosnow & Rosenthal (1966)     Perception experiment     48%       13%     .02
Rosnow & Rosenthal (1967)^a   Psychology experiment     27%       10%     .005
Schubert (1964)               Psychology experiment     60%       44%     .001
Wicker (1968)                 Questionnaire             55%       38%     .10

MORE VOLUNTEERING BY MALES

Howe (1960)                   Electric shock            67%       81%     .05
Schultz (1967b)               Sensory deprivation       56%       76%     .06
Siegman (1956)                Sex interview             12%       42%     .02
Wilson & Patterson (1965)     Psychology experiment     60%       86%     .005

^a Unpublished data. The experiment on which these data are based is described
later in the present chapter.


four studies found the inverse relationship. Those studies for which
women are more likely to volunteer seem to have in common that they
requested subjects to participate in rather standard or unspecified
psychological experiments.* The exceptions are the studies by Schachter
and Hall (1952) and Wilson and Patterson (1965). The former study
asked for volunteers for a study of interpersonal attraction and found
no sex differences in rates of volunteering. The latter study employed
a vague request for volunteers to which the New Zealand male
undergraduates responded more favorably than females.

* Related to these results are those obtained by Rosen (1951) and Schubert
(1964), both of whom found males more likely to volunteer for standard experiments
if they showed greater femininity of interests.

The experiments by Hilgard et al. (1961) and London (1961) had
requested volunteers for hypnosis and neither had found any sex
differences in volunteering. London did find, however, that among those
subjects who were "very eager" to participate, males predominated. For
the hypnosis situation London felt that women were less likely to show
such eagerness because of a greater fear of loss of control. Perhaps
being very eager to be hypnotized, willing to be electrically shocked
(Howe, 1960), and sensation deprived (Schultz, 1967b), and ready to
answer questions about sex behavior (Siegman, 1956) reflect the
somewhat greater degree of unconventionality that is more often associated
in our culture with males than with females.*

If one were to attempt to summarize the findings thus far one might
hypothesize that in behavioral research (a) there is a likelihood of
females volunteering more than males if the task for which participation
is solicited is perceived as relatively standard and (b) there is a
likelihood of males volunteering more than females if the task is perceived
as unusual. Some modest support for this hypothesis comes from the
research of Martin and Marcuse (1958). Volunteers were solicited for
four experiments: one in learning, a second on personality, a third
involving hypnosis, and a fourth for research on attitudes toward sex.
Female volunteers were overrepresented in the first three experiments,
those which could be described as relatively more standard. Male
volunteers were overrepresented in the sex study.
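
The hypothesis above can be checked against the direction of the
differences listed in Table I. The following is a minimal Python sketch,
not part of the original analysis: the percentages are copied from
Table I, while the names TABLE_I and tally_by_direction are introduced
here purely for illustration.

```python
# Percentages volunteering as listed in Table I:
# (author, task, females, males).
TABLE_I = [
    ("Himelstein (1956)", "Psychology experiment", 65, 43),
    ("Newman (1956)", "Perception experiment", 60, 39),
    ("Newman (1956)", "Personality experiment", 59, 45),
    ("Ora (1966)", "Psychology experiments", 66, 54),
    ("Rosnow & Rosenthal (1966)", "Perception experiment", 48, 13),
    ("Rosnow & Rosenthal (1967)", "Psychology experiment", 27, 10),
    ("Schubert (1964)", "Psychology experiment", 60, 44),
    ("Wicker (1968)", "Questionnaire", 55, 38),
    ("Howe (1960)", "Electric shock", 67, 81),
    ("Schultz (1967b)", "Sensory deprivation", 56, 76),
    ("Siegman (1956)", "Sex interview", 12, 42),
    ("Wilson & Patterson (1965)", "Psychology experiment", 60, 86),
]


def tally_by_direction(rows):
    """Group studies by which sex volunteered at the higher rate.

    Returns a dict mapping "females" / "males" to a list of
    (author, task, difference) tuples, where the difference is the
    female rate minus the male rate in percentage points.
    Studies with equal rates are skipped (Table I contains none).
    """
    groups = {"females": [], "males": []}
    for author, task, females, males in rows:
        diff = females - males
        if diff == 0:
            continue
        key = "females" if diff > 0 else "males"
        groups[key].append((author, task, diff))
    return groups


groups = tally_by_direction(TABLE_I)
for sex, studies in groups.items():
    print(f"More volunteering by {sex}: {len(studies)} studies")
    for author, task, diff in studies:
        print(f"  {author:<28} {task:<24} {diff:+d} points")
```

Listing the tasks in each group makes the pattern easy to inspect: three
of the four studies in the "males" group involve electric shock, sensory
deprivation, or a sex interview, and the fourth is the Wilson and
Patterson study that the text treats as an exception.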

The joint effects on volunteering rates of subjects' sex and nature
of the task for which participation is solicited are probably complicated
by other variables. For example, Coffin (1941) long ago cautioned about
the complicating effects of the investigator's sex, and one may wonder,
along with Coffin and Martin and Marcuse (1958), about the differential
effects on volunteer rates among male and female subjects of being
confronted with a male versus a female Kinsey interviewer as well as
the differential effects on eagerness to be hypnotized of being
confronted with a male versus a female hypnotist.

* Consistent with this interpretation is the finding by Wolf and Weiss (1965)
that, relative to female subjects, male subjects showed the greater preference for
isolation experiments.

Our interest in volunteers is based on the fact that only they can
provide us with the data the nonvolunteers have refused us. But not
all volunteers, it usually turns out, provide us with the data we need.
To varying degrees in different studies there will be those volunteers
who fail to keep their experimental appointment. These "no-shows" have
been referred to as "pseudovolunteers" by Levitt, Lubin, and Brady
(1962), who showed that on a variety of personality measures
pseudovolunteers are less like volunteers, and more like the nonvolunteers who
never agreed to come in the first place. Other studies also have examined
the characteristics of experimental subjects who fail to keep their
appointments. Frey and Becker (1958) found no sex differences between
subjects who notified the investigator that they would be absent versus
those who did not notify him. Though these results argue against a
sex difference in pseudovolunteering, it should be noted that the entire
experimental sample was composed of extreme scorers on a test of
introversion-extraversion. Furthermore, no comparison was given of either
group of no-shows with the parent population from which the samples
were drawn. Leipold and James (1962) also compared the characteristics
of shows and no-shows among a random sample of introductory
psychology students who had been requested to serve in an experiment
in order to satisfy a course requirement. Again, no sex differences were
found. Interestingly enough, however, about half of Frey and Becker's
no-shows notified the experimenter that they would be absent while
only one of Leipold and James' 39 no-shows so demeaned himself.
Finally, there is the more recent study by Wicker (1968), in which it
was possible to compare the rates of pseudovolunteering by male and
female subjects for a questionnaire study. These results also yielded
no sex differences. Hence, three studies out of three suggest that failing
to provide the investigator with data promised him probably is no more
apt to be the province of males than of females.

B. Birth Order

Stemming from the work of Schachter (1959) there has been
increasing interest shown in birth order as a useful independent variable in
behavioral research (Altus, 1966; Warren, 1966). A number of studies
have attempted to shed light on the question of whether firstborns or
only children are more likely than laterborns to volunteer for behavioral
research. But for all the studies conducted there are only a few that
suggest a difference in volunteering rates among first- and laterborns
to be significant at even the .10 level. It is suggestive, however, that
all of these studies found the firstborn to be overrepresented among
the volunteering subjects. Capra and Dittes (1962) found that among
their Yale University undergraduates 36 per cent of the firstborns, but
only 18 per cent of the laterborns, volunteered for an experiment
requiring cooperation in a small group. Varela (1964) found that among
Uruguayan male and female high school students 70 per cent of the
firstborns, but only 44 per cent of the laterborns, volunteered for a small
group experiment similar to that of Capra and Dittes. Altus (1966)
reported that firstborn males were overrepresented relative to laterborn
males when subjects were asked to volunteer for testing. Altus obtained
similar results when the subjects were female undergraduates, but in
that case the difference in volunteering rates was not statistically
significant. Suedfeld (1964) recruited subjects for an experiment in sensory
deprivation and found that 79 per cent of those who appeared were
firstborns while only 21 per cent were laterborns. Unfortunately we do
not know for this sample of subjects what proportion of those
who did not appear were firstborn. It seems [...]
that the base rate for primogeniture would approach [...] per cent for
[...] (1966) was unable to find a higher proportion than [...]
of volunteering by [...] laterborns were
found in studies by Lubin, Brady, and Levitt [...], [...]
Smith, and Goffard (1966), Poor (1967), Rosnow and Rosenthal ([...]
unpublished data), Schultz (1967a), Ward (1964), Wilson and Patterson
(1965), and Zuckerman, Schultz, and Hopkins (1967). In [...]
in which no p reached even the .10 level, not even the ten[...]

In several of the studies relating birth order to volunteering the focus
was less on whether a subject would volunteer and more on the type
of experiment for which he would volunteer. That was the case in the
study by Brock and Becker (1965), who found no differences between
first- and laterborns in their choices of individual or group experiments.
Studies reported by Weiss, Wolf, and Wiltsey (1963) and by Wolf
and Weiss (1965) suggest that preference for participation in group
experiments by firstborn versus laterborn subjects may depend on the
recruitment method. When a ranking of preferences was employed,
firstborns more often volunteered for a group experiment. However, when
a simple yes-no technique was employed, firstborns volunteered relatively
less for group than for individual or isolation experiments.

Thus, most of the studies show no significant relationship between
birth order and volunteering. However, in those few studies where there
is a significant relationship, the results suggest that it is the firstborn
or only child who is more likely to volunteer. This finding might be
expected on the basis of work by Schachter (1959) suggesting the
greater sociability of the firstborn. It is this variable of sociability to
which we now turn our attention.

C. Sociability 

Using as subjects male and female college freshmen, Schubert (1964)
observed that volunteers (n = 562) for a "psychological experiment"
scored higher in sociability on the Social Participation Scale of the MMPI
than nonvolunteers (n = 443). A similar positive relationship between
sociability and volunteering has been reported by others. Martin and
Marcuse (1957, 1958) found that female volunteers for an experiment
in hypnosis measured higher in sociability on the Bernreuter than female
nonvolunteers.* London, Cooper, and Johnson (1962) found a tendency
for their more serious volunteers to be somewhat more sociable than
those less serious about serving science, sociability here being defined
by the California Psychological Inventory, the 16 PF, and MMPI.

* Though one would expect the factor of introversion-extraversion to be related to
sociability, and so might predict greater extraversion among volunteers, Martin and
Marcuse found no differences in introversion-extraversion between volunteers and
nonvolunteers, while Ora (1966) found volunteers, and especially males, to be
significantly more introverted than nonvolunteers. Another surprising finding by
Frey and Becker (1958) is relevant insofar as volunteering may be related
to styles of pseudovolunteering. Among those subjects who failed to keep an appointment
for an experiment in which they had previously agreed to participate, those
who notified the experimenter that they would be unable to attend had lower
sociability scores on the Guilford than those who failed to appear without notifying
the experimenter. It is difficult to explain this somewhat paradoxical finding that
presumably less thoughtful pseudovolunteers are in fact more sociable than their
more thoughtful counterparts.

Thus, it would appear that volunteers, especially females, are higher
in sociability than nonvolunteers. The relationship in fact, however, is
not always this simple. Although Lubin, Brady, and Levitt (1962a)
observed that student nurses who volunteered for hypnosis scored higher
than nonvolunteers on a Rorschach content dependency measure (a
finding which is consistent with those above), it also was observed that
volunteers were significantly less friendly as defined by the
Guilford-Zimmerman. On intuitive grounds the latter finding would appear to
be inconsistent with the simple, positive sociability-volunteering
relationship. One might expect a positive relationship between sociability and
dependency, but certainly not a negative relationship between sociability
and friendliness. Despite the confusion, it is clear that in research on
hypnosis, differences between volunteers and nonvolunteers are likely
to bias results. Boucher and Hilgard (1962) have shown that subjects
who are less willing to participate in hypnosis research are clearly
more resistant to showing hypnotic behavior when they are conscripted
for research.

A factor that is likely to complicate the sociability-volunteering
relationship is the nature of the task for which volunteering is requested.
When Poor (1967) solicited volunteers for a psychological experiment
he found the volunteers to be higher in sociability than nonvolunteers
on the California Psychological Inventory. However, when the task was
completing a questionnaire, the return rate for the less sociable subjects
tended to be higher than that for the more sociable (p < .25).

If sociability can be defined on the basis of membership in a social
fraternity, then other findings become relevant as well. Reuss (1943)
obtained higher return rates among fraternity and sorority members
(high sociability?) than among independents (lower sociability?).
However, Abeles, Iscoe, and Brown (1954-55), in a study in which
undergraduates were invited by the president of their university to
complete questionnaires concerning the Draft, the Korean War,
and vocational aspirations, found that fraternity men were
underrepresented in the initial sample of volunteers, but
that in a subsequent session, which followed a letter "ordering" the
students to participate and then a personal phone call, fraternity men
were significantly overrepresented in the volunteer sample. If sociability
can be defined in terms of verbosity, then another study becomes
relevant. In an experiment on social participation, Hayes, Meltzer, and
Lundberg (1968) noted that undergraduate volunteers were more
talkative than (nonvolunteer) conscripts. The relationship is complicated by
the fact that verbosity, since it was observed after the volunteer request,
must be considered a dependent variable. Perhaps conscripts are more given
to moodiness and quietude. One cannot be absolutely certain that
conscripts would also have been less talkative than the volunteers
before the experiment began.

In some cases, characteristics of volunteers do tend to remain stable
over appeals for participation in different types of tasks. Earlier we
described the research of Lubin, Brady, and Levitt (1962a) in which
student nurses were asked to volunteer for hypnosis research. In another
study of student nurses, Lubin, Levitt, and Zuckerman (1962) asked
for the return of a mailed questionnaire. Volunteers in the hypnosis
study were more dependent than nonvolunteers as defined by a
Rorschach measure. In the questionnaire study, those who chose to
respond were also more dependent than nonresponders despite the fact
that a different definition of dependency was employed, viz., one based
on the Edwards Personal Preference Schedule. Though there are a good
many equivocal results to complicate the interpretation, and perhaps
even some contradictory findings, at least in the bulk of studies showing
any clear difference in sociability between volunteers and nonvolunteers,
it would appear that volunteers tend to be the more sociable.

D. Approval Need 

Crowne and Marlowe (1964) have elaborated the empirical and
theoretical network of consequence that surrounds the construct of approval
motivation. Using the Marlowe-Crowne (M-C) Scale as their measure
of need for social approval, they have shown that high scorers are more
influenceable than low scorers in a variety of situations. Directly relevant
to the present chapter is their finding that high scorers report greater
willingness to serve as volunteers in an excruciatingly dull task.
Consistent with this finding is the observation of Leipold and James (1962)
that determined male nonvolunteers tend to score lower than volunteers
on the M-C.

Similarly, Poor (1967) found that volunteers for an experiment, the
nature of which was unspecified, scored higher in need for approval
than nonvolunteers on the M-C Scale. Poor also found that subjects
higher in need for approval were more likely than low need approvals
to return mailed questionnaires to the investigator. For both of Poor's
samples the significance levels were unimpressive in magnitude, but
impressive in consistency; both p's were .13 (two-tail).

C. Edwards (1968) invited student nurses to volunteer for an hypnotic
dream experiment and uncovered no difference between volunteers and
nonvolunteers in their average M-C scores. This failure to replicate
the findings above may have been due to differences in the type of
subject solicited or perhaps to the nature of the experiment. Another,
rather intriguing, finding was that the need for approval of the volunteers'
best friends was significantly higher than the need for approval of the
nonvolunteers' best friends. Also, the students' instructors rated the
volunteers as significantly more defensive than the nonvolunteers. Thus,
at least in their choice of best friends and in their instructors' judgment,
though not necessarily in their own test scores, volunteers appear to
show a greater need for social approval.

Edwards went further in his analysis of subjects' scores on the M-C
Scale. He found a nonlinear trend suggesting that the volunteers were
more extreme scorers on the M-C than nonvolunteers, i.e., either too
high (which one might have expected) or too low (which one would
not have expected). Edwards' sample size of 37 was too small to
establish the statistical significance of the suggested curvilinear relationship.
However, Poor (1967), in both of the samples mentioned earlier, also
found a curvilinear relationship, and in both samples the direction of
curvilinearity was the same as in Edwards' study. The more extreme
scorers were those more likely to volunteer. In Poor's smaller
sample of 40 subjects who were asked to volunteer for an experiment, the
curvilinear relationship was not significant. However, in Poor's larger
sample of 169 subjects who were asked to return a questionnaire, the
curvilinear relationship was significant at p < .0002 (two-tail).
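
A curvilinear relationship of the kind suggested by Edwards and Poor can
be made concrete by comparing volunteering rates across low, medium, and
high scorers. The following Python sketch uses invented scores and cut
points purely for illustration. None of the numbers come from the studies
cited, and the quadratic contrast shown is one simple way to describe a
U-shaped pattern, not necessarily the test those authors used.

```python
def rates_by_band(records, low_cut, high_cut):
    """Volunteering rate among low, medium, and high scorers.

    records: iterable of (score, volunteered) pairs.
    Scores below low_cut are "low", scores above high_cut are "high",
    and everything else is "medium".
    """
    counts = {"low": [0, 0], "medium": [0, 0], "high": [0, 0]}
    for score, volunteered in records:
        if score < low_cut:
            band = "low"
        elif score > high_cut:
            band = "high"
        else:
            band = "medium"
        counts[band][0] += int(volunteered)  # number who volunteered
        counts[band][1] += 1                 # number in the band
    if any(total == 0 for _, total in counts.values()):
        raise ValueError("every band needs at least one subject")
    return {band: yes / total for band, (yes, total) in counts.items()}


def linear_contrast(rates):
    """Positive when high scorers volunteer more than low scorers."""
    return rates["high"] - rates["low"]


def quadratic_contrast(rates):
    """Positive when medium scorers volunteer less than the extremes."""
    return rates["low"] - 2 * rates["medium"] + rates["high"]


# Hypothetical (score, volunteered) records; not data from any study.
HYPOTHETICAL = [
    (5, True), (6, True), (7, False),
    (14, False), (15, False), (16, True),
    (24, True), (25, True), (26, True),
]

rates = rates_by_band(HYPOTHETICAL, low_cut=10, high_cut=20)
print(rates)
print("linear:", linear_contrast(rates))
print("quadratic:", quadratic_contrast(rates))
```

In this invented example both contrasts are positive: high scorers
volunteer most, but medium scorers volunteer least, which is the shape
described in the text.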

So far our definition of need for approval has depended heavily on
the Marlowe-Crowne Scale, but there is evidence that other
paper-and-pencil measures might well give similar results. McDavid's (1965)
research, using his own Social Reinforcement Scale, also indicated a
positive relationship between approval seeking and volunteering. Using still
another measure of need for approval (Christie-Budnitzky), Hood and
Back (1967) found their volunteers to score higher than their
nonvolunteers. Their finding was significant for male subjects while for
female subjects there was a tendency for the relationship between need for
approval and volunteering to depend on the task for which volunteering
was requested.

With the one exception, then, of the study by C. Edwards, who
recruited a different type of subject for a different type of experiment,
it would appear that volunteers tend to be higher than nonvolunteers
in their need for approval. However, the nonlinear trend
suggested in the results of Edwards and Poor implies that the positive
relationship may only hold for the upper range of the scale. It
may be subjects showing medium need for approval who will volunteer
the least.


E. Conformity

It seems almost tautological to consider the relationship between
volunteering and conformity, for the act of volunteering is itself an act
of conformity to some authority's request or invitation to participate.
We shall see, however, that conforming to a request to volunteer is
by no means identical with, and often not even related to, other
definitions of conformity.

Crowne and Marlowe (1964) have summarized the evidence that
subjects higher in need for approval are more likely than low need
approvals to conform to the demands of an experimental task such as
an Asch-type situation. Since need for approval is positively related
both to volunteering (at least in the upper range) and to conformity,
one would expect a positive relationship between conformity in an
Asch-type situation and volunteering. Foster's (1961) findings, though
not statistically significant, imply such a relationship among male
subjects, but just the opposite among females.

If \oluntcers can be charactenzed as conforming, then one , 

expect them to be low in autonomy Using the Edwards Persona 
ence Schedule, such a finding was obtained by C Edwards t 
for his sample of student nurses, the volunteers also being jimgc 
ihcir instructors as more conforming than the nonvoluntcers Hm' 
dnmctncall) opposite results were obtained by Nc\%man (Ifi-^ / 
using the Eduards Schedule, but where the task was a perception 
ment Both male and female \olunlccrs were significant!) more au 
mous than male and female non\oluntccrs ^^^lcn an expenment m 


sonaht) was the task, no dificnmcc in autonomy was i 


scaled 


N-olunlccrs and nomoluntcers . 

Lubin, Levitt, and Zuckerman (1962) also employed the Edwards
Personal Preference Schedule, finding that student nurses who completed
and returned a questionnaire scored lower in autonomy (and in
dominance) than nonrespondents. The differences were, however, not
judged statistically significant. To further confound the array of
results, it must be added that Martin and Marcuse (1957) found male
volunteers for an hypnosis experiment to be significantly more dominant
than nonvolunteers on the Bernreuter. And Frye and Adams (1959), also
using Edwards' scales, obtained no appreciable differences on any of
the measures between male and female volunteers versus nonvolunteers.

There appears to be little consistency in the relationships obtained
between conformity and voluntarism. In the majority of studies no
significant relationship was obtained between these variables. However,
considering only those studies in which a significant relationship was
obtained, one might tentatively conclude that the direction of
relationship is unpredictable for female subjects but that male
volunteers are probably more autonomous than male nonvolunteers.

F. Authoritarianism

A number of investigators have compared volunteers with nonvolunteers
on the basis of several related measures of authoritarianism. Rosen
(1951), using the F Scale definition of authoritarianism, found that
volunteers for personality research scored lower than nonvolunteers.
Newman (1956) also found volunteers to be less authoritarian on the F
Scale, but his finding was complicated by the interacting effects of
type of experiment and sex of subject. Thus, only when recruitment was
for an experiment in perception and only when the subjects were male
was there a significant difference in authoritarianism between
volunteers and nonvolunteers. When recruitment was for an experiment in
personality, neither male nor female volunteers showed significantly
lower authoritarianism than nonvolunteers.

Poor (1967), also employing the F Scale, found that mail questionnaire
respondents were less authoritarian than nonrespondents. However, in
soliciting volunteers for an experiment in social psychology, Poor
obtained no differences in authoritarianism between volunteers and
nonvolunteers.

Martin and Marcuse (1957), in their study of volunteers for hypnosis
research, employed the Ethnocentrism (E) Scale. Volunteers, especially
males, were found to be significantly less ethnocentric than
nonvolunteers. However, Schubert (1964), employing MMPI definitions of
prejudice and tolerance, obtained no difference between volunteers and
nonvolunteers for a psychological experiment.

Consistent with the general trend toward lower authoritarianism among
volunteers are the results of Wallin (1949). He found that participants
in survey research are politically and socially more liberal than
nonparticipants. Benson, Booman, and Clark (1951) found a more
favorable attitude toward minority groups among people who were willing
to be interviewed than among those who were not cooperative. Finally,
Burchinal (1960) found that undergraduates who completed questionnaires
at scheduled sessions were less authoritarian than students who did
not.

The bulk of the evidence suggests that volunteers are likely to be
less authoritarian than nonvolunteers. This conclusion seems most
warranted for those studies in which the subjects were asked to respond
either verbally or in written form to questions of a personal nature.
In all five samples where the task was to answer such personal
questions, those subjects who were less authoritarian, broadly defined,
were more cooperative.


G. Conventionality

There is a sense in which the more authoritarian individual is also
the more conventional, so that one might expect volunteers for
behavioral research to be less conventional than nonvolunteers. That
appears often, but by no means always, to be the case. Thus, Wallin
(1949), who found survey respondents to be less authoritarian than
nonrespondents, did not find a difference in conventionality between
these groups. Rosen (1951), however, found that volunteers for
personality research were less conventional than nonvolunteers, while
C. Edwards (1968) reported the opposite finding. In the latter study,
student nurses who volunteered for an hypnotic dream experiment were
judged by their instructors to be more conventional than nonvolunteers.

A number of studies have discovered that volunteers for Kinsey-type
interviews tend, either in their sexual behavior or in their attitudes
toward sex, to be more unconventional than nonvolunteers (Maslow, 1942;
Maslow and Sakoda, 1952; Siegman, 1956). In order to determine whether
this relative unconventionality of volunteers is specific to the
Kinsey-type situation, one would need to know if these same volunteers
were more likely than nonvolunteers to participate in other types of
psychological research. It also would be helpful if one knew whether
groups matched on the basis of sexual conventionality, but differing
in other types of conventionality, exhibited different rates of
volunteering for Kinsey-type interviews.

The Pd scale of the MMPI often is regarded clinically as reflecting
dissatisfaction with societal conventions, and higher scorers may be
regarded as less conventional than lower scorers. Both London et al.
(1962) and Schubert (1964) found volunteers for different types of
experiments to be less conventional by this definition, though with
their army servicemen subjects, Myers, Murphy, Smith, and Goffard
(1966) found volunteers for a perceptual isolation experiment to be
more conventional. London et al. and Schubert further found volunteers
to score higher on the F scale of the MMPI, which reflects a
willingness to admit to unconventional experiences. The Lie scale of
the MMPI taps primness and propriety, and high scorers may be regarded
as more conventional than low scorers. Although Heilizer (1960) found
no Lie scale differences between volunteers and nonvolunteers, Schubert
(1964) found that volunteers scored lower.

These results are not unequivocal, but in general it would appear
that volunteers for behavioral research tend to be more unconventional
than nonvolunteers. Six studies support this conclusion. (However, two
others find no difference, and two others report the opposite
relationship.)

It would not be surprising if further research proved that sex
differences are significant determinants of the nature of the
conventionality-volunteering relationship. Recall that London et al.
(1962) concluded, at least for hypnosis research, that females who
volunteer may be significantly more interested in the novel and the
unusual, whereas for males the relationship is less likely. A finding
was noted earlier, under the heading of conformity, that may bear out
London et al. That was Foster's (1961) finding which, though not
statistically significant, implied that the relationship between
conformity and volunteering may be in opposite directions for males
versus females.

H. Arousal Seeking 

On the basis of his results using over 1,000 subjects, Schubert (1964)
has postulated a trait of arousal seeking on which he found volunteers
to differ from nonvolunteers. He notes that volunteers for a
“psychological experiment” reported drinking more coffee, taking more
caffeine pills, and (among males) smoking more cigarettes than
nonvolunteers. All three types of behavior are related conceptually and
empirically to arousal seeking. In partial support of Schubert's
results are those obtained by Ora (1966). Though he found volunteers
reporting significantly greater consumption of coffee and caffeine
pills than was reported by nonvolunteers, there were no differences in
cigarette smoking between volunteers and nonvolunteers. However, recent
unpublished data collected by Rosnow and Rosenthal (1967), which are
described in greater detail later, indicate no overall, significant
relationship between volunteering for a “psychological experiment” and
either smoking or coffee drinking. In fact, among males, there is a
tendency for volunteers to smoke less (p = .07) and to drink less
coffee (p = .06) than nonvolunteers. The relationship between smoking
and coffee drinking [...]

Also inconsistent with Schubert's results are the findings of Poor
(1967). In both his questionnaire study and in his social psychological
experiment, Poor found no significant relationship between smoking and
participation by his predominantly male subjects. In fact, Poor
obtained trends opposite to those of Schubert: participants smoked
less and reported drinking less alcohol than nonparticipants. Myers et
al. (1966) found no relationship between smoking and volunteering for
isolation experiments. On the whole, it does not appear that smokers
and coffee drinkers are necessarily overrepresented among volunteers
for behavioral research.

Fortunately, Schubert's construct of arousal seeking does not rest
heavily on the associated behavior of smoking and coffee drinking. He
found that a variety of MMPI scales, associated with arousal seeking,
discriminated significantly between volunteers and nonvolunteers. The
MMPI characteristics which Schubert found associated with greater
likelihood of volunteering generally coincide with those noted by
London et al. (1962). One important exception, however, is that the
hypomania (Ma) scale scores of the MMPI were found by Schubert to
correlate positively with volunteering (a result of some importance to
the arousal seeking hypothesis), while London et al. found a negative
relationship between Ma scores and volunteering for an hypnosis
experiment. This reversal weakens the generality of the arousal seeking
hypothesis, which is further weakened by Rosen's (1951) finding that
female volunteers scored lower on the Ma scale than female
nonvolunteers.

Nevertheless, there are other data which lend support to Schubert's
hypothesis that volunteers are more arousal seeking than nonvolunteers.
Riggs and Kaess (1955) observed that volunteers were characterized by
more cycloid emotionality on the Guilford Scale than nonvolunteers, a
result that is not inconsistent with Schubert's finding of higher Ma
scores among volunteers. Howe (1960) reports that volunteers willing to
undergo electric shocks were characterized by less need to avoid shock
than nonvolunteers, a finding that is not totally tautological and one
that is consistent with the arousal seeking hypothesis. However, Riggs
and Kaess also found that volunteers were characterized by more
introversive thinking on the Guilford than nonvolunteers, a result that
is not supportive of the arousal seeking hypothesis.

Closely related to Schubert's concept of arousal seeking is the concept
of sensation seeking discussed by Zuckerman, Schultz, and Hopkins
(1967). In a number of studies Zuckerman et al. compared volunteers
with nonvolunteers on a specially developed Sensation Seeking Scale
(SSS). In one study, female undergraduates who volunteered for sensory
deprivation were found to have scored higher than nonvolunteers on
the SSS. In a second study, which also employed female undergraduates,
volunteers for an hypnosis experiment scored higher in sensation
seeking than nonvolunteers. In a third study, male undergraduates were
invited to volunteer for sensory deprivation and/or hypnosis research.
Subjects who volunteered for both experiments scored highest in
sensation seeking, while those who volunteered for neither task scored
lowest on the SSS and also on the Ma scale of the MMPI. The correlation
between scores on the SSS and the Ma scale, though statistically
significant, is low enough (+.21) that one would not expect the results
obtained on the basis of the Ma scale to be attributable entirely to
the results obtained with the SSS.
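
As a rough illustration of why a correlation of +.21 counts as low in this argument, the short Python sketch below squares the coefficient to obtain the proportion of variance the two scales share. Reading the squared correlation as shared variance is a general statistical convention and not a computation reported in the studies cited; the function name is hypothetical.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance two measures share, given their correlation r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and +1")
    return r * r


# Correlation between SSS and Ma scale scores as reported in the text.
r_sss_ma = 0.21
print(f"shared variance: {shared_variance(r_sss_ma):.1%}")
```

On this reading the two scales overlap by only a few per cent of their variance, which is in keeping with treating the SSS and Ma results as largely separate pieces of evidence.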

In another study in which Schultz (1967b) solicited volunteers for
sensory deprivation, male volunteers obtained significantly higher
scores on the SSS than male nonvolunteers, while among female subjects
a less clearly significant difference in the same direction was
revealed. Finally, Schultz (1967c) invited female undergraduates to
volunteer for a sensory restriction experiment. On the basis of scores
on the Cattell Scales, volunteers could be judged more adventurous than
nonvolunteers.

At least when arousal seeking is defined in terms of the Sensation
Seeking Scale, there appears to be substantial support for Schubert's
hypothesis that volunteers are more arousal seeking than nonvolunteers.
When other tests or scales are used to define arousal seeking, the
results are less consistent, though even then the hypothesis is not
completely without support.

I. Anxiety 

There is no dearth of studies comparing the more or less enduring
anxiety levels of volunteers and nonvolunteers. Table II summarizes
the results of 11 of those studies that could be most easily
categorized as to outcome. In seven of these there appeared to be no
difference between volunteers and nonvolunteers, Lubin, Brady, and
Levitt (1962a) having employed the IPAT measure of anxiety, and the
remaining studies using the Taylor Manifest Anxiety Scale or a close
relative of that scale. The tasks for which volunteers were solicited
included hypnosis, sensory deprivation, electric shock, Kinsey-type
interviews, small groups experiments, and an unspecified “psychological
experiment.” On the basis of these seven studies one could certainly
conclude that, at least in terms of manifest anxiety, volunteers are
usually no different than nonvolunteers.

                               TABLE II

    Studies of the Anxiety Level of Volunteers versus Nonvolunteers

Volunteers                                                Volunteers
more anxious      No difference                           less anxious

Rosen (1951)      Heilizer (1960)                         Myers et al. (1966)
Schubert (1964)   Himelstein (1956)                       Scheier (1959)
                  Hood and Back (1967)
                  Howe (1960)
                  Lubin, Brady, and Levitt (1962a)
                  Siegman (1956)
                  Zuckerman, Schultz, and Hopkins (1967)

However, results of the other four studies listed in Table II make
it difficult to reach this simple conclusion, since these studies show
significant differences at around the .05 level. Moreover, the fact
that two find volunteers to be more anxious than nonvolunteers, while
two others find just the opposite relationship, only complicates the
attempt to summarize simply the collective results. One might be
tempted to take the algebraic mean of the differences in anxiety level
found between volunteers and nonvolunteers in these four studies, but
that would be like averaging the temperature of one winter and one
summer and concluding that there had been two springs. Furthermore, one
cannot attribute the inconsistency in results to the instruments
employed to measure anxiety. Scheier (1959) used the IPAT; Myers et al.
(1966), Rosen (1951), and Schubert (1964) all employed the usual MMPI
scales or derivatives (Depression Scale, Psychasthenia Scale, or Taylor
Manifest Anxiety Scale).
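
The point about the algebraic mean can be made concrete with a small Python sketch. The four signed differences below are invented for illustration only (they are not values from the studies in Table II), and the study labels and function name are hypothetical; the sketch simply shows how opposite results of similar size average to zero while a tally by direction, or a mean of absolute differences, keeps the disagreement visible.

```python
from statistics import mean

# Hypothetical standardized differences in anxiety (volunteers minus
# nonvolunteers). These numbers are invented for illustration; they are
# not taken from any of the studies discussed in the text.
signed_differences = {
    "study A": +0.4,
    "study B": +0.4,
    "study C": -0.4,
    "study D": -0.4,
}


def summarize(differences):
    """Contrast the algebraic mean with summaries that preserve direction."""
    values = list(differences.values())
    return {
        "algebraic_mean": mean(values),
        "mean_absolute_difference": mean(abs(v) for v in values),
        "volunteers_more_anxious": sum(v > 0 for v in values),
        "volunteers_less_anxious": sum(v < 0 for v in values),
    }


summary = summarize(signed_differences)
print(summary)
```

Under these made-up values the algebraic mean is zero even though every study shows a difference, which is the "two springs" conclusion the text warns against.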

One possibility to explain the inconsistency concerns the
anxiety-arousing nature of the tasks for which volunteers were
solicited. The tasks for which more anxious subjects volunteered were
an MMPI examination (Rosen, 1951) and participating in a “psychological
experiment” (Schubert, 1964). The tasks for which less anxious subjects
volunteered were a sensory deprivation study (Myers et al., 1966) and
one that Scheier (1959) left unspecified but characterized as somewhat
threatening. As a working hypothesis let us suggest that although most
often there will be no difference in the level of chronic anxiety
between volunteers and nonvolunteers, when such a difference does occur
it will be the more threatening experiment that will draw the less
anxious volunteer and the ordinary experiment that will draw the more
anxious volunteer. Thus, the more anxious subject worries more about
the consequences of refusing to volunteer, but only so long as the task
is not itself perceived as frightening. If it is frightening, then the
more anxious and fearful subject may decide that he cannot tolerate the
additional anxiety that his participation would engender and so chooses
not to volunteer.

Some weak support for this hypothesis comes from the work of Martin
and Marcuse (1958), who obtained a complicated interaction between
volunteering, anxiety level, and task. No differences in anxiety level
were found between volunteers and nonvolunteers of either sex when
recruiting for experiments in learning or attitudes toward sex.
However, when volunteering was requested for a personality experiment,
both male and female volunteers were found to be more anxious than
nonvolunteers. When volunteering was requested for an experiment on
hypnosis, male volunteers were found to be less anxious than male
nonvolunteers, a difference that was not obtained among female
subjects. These results seem to parallel the results of the studies
listed in Table II. Most of the time no differences were found in
anxiety level between volunteers and nonvolunteers. When differences
were obtained, the volunteers for the more ordinary experiments were
more anxious, while the (male) volunteers for the more unusual, perhaps
more threatening, experiments were less anxious. The results of the
Martin and Marcuse research again emphasize the importance of the
variable of subject's sex as a moderating or complicating factor in the
relationship between volunteering behavior and various personal
characteristics.

Further support for the hypothesis that the more fearful the subject,
the less he will volunteer for a frightening experiment comes from a
study by Brady, Levitt, and Lubin (1961). Seventy-six student nurses
were asked to indicate whether they were afraid of hypnosis. Two weeks
later, volunteers for an experiment in hypnosis were solicited. Of
those nurses who volunteered, 40 per cent had indicated at least some
fear of hypnosis, while among the nonvolunteers more than double that
number (82 per cent) indicated such fear (p < .0002). It should be
noted, however, that student nurses who volunteered did not differ in
anxiety as measured by the IPAT from those who did not volunteer.

Less relevant to the question of volunteering but quite relevant to
the related question of who finds their way into the role of research
subject is the study by Leipold and James (1962). Male and female
subjects who failed to appear for a scheduled psychological experiment
were compared with subjects who kept their appointments. Among the
female subjects, those who appeared did not differ in anxiety on the
Taylor Scale from those who did not appear. However, male subjects who
failed to appear (the determined nonvolunteers) were significantly
more anxious than those male subjects who appeared as scheduled. These
findings not only emphasize the importance of sex differences in
studies of volunteer characteristics, but also that it is not enough
simply to know who volunteers; even among those subjects who volunteer
there are likely to be differences between those who actually show up
and those who do not.

J. Psychopathology 

We now turn our attention to variables that have been related to
global definitions of psychological adjustment or pathology. Some of
the variables discussed earlier have also been related to global views
of adjustment, but our discussion of them was not intended to carry
special implications bearing on subjects' adjustment. For example, when
anxiety was the variable under discussion, it was not intended that
more anxious subjects be regarded as more maladjusted. Indeed, within
the normal range of anxiety scores found, the converse might be the
case.

There are perhaps a score of studies relevant to the question of the
psychological adjustment of volunteers versus nonvolunteers. The
results, however, are equivocal. About one-third of the studies suggest
that volunteers are better adjusted, another third suggest that
nonvolunteers are better adjusted, and the remainder reveal no
difference in adjustment between volunteers and nonvolunteers. We begin
by summarizing the results that indicate that volunteers are
psychologically more healthy than nonvolunteers.

Self-esteem is usually regarded as a correlate, if not a definition, of
good adjustment. Maslow (1942) and Maslow and Sakoda (1952) summarized
the results of six studies of volunteering for Kinsey-type interviews
dealing with respondents' sexual behavior. In five cases, volunteers
revealed greater self-esteem (but not greater security) than
nonvolunteers as measured by Maslow's own tests. The one case that
tended to show volunteers to be lower in self-esteem than nonvolunteers
was a sample drawn from a class in abnormal psychology. These subjects
were found to have an atypical distribution of self-esteem scores:
very high and very low scorers were overrepresented among [...]. Some
time later, Siegman (1956) also solicited volunteers for a Kinsey-type
interview, administering his own self-esteem measure to the subjects.
He found no differences in self-esteem between volunteers and
nonvolunteers.

In one of the studies by Poor (1967), subjects were requested to
complete and to return a questionnaire. “Volunteers” may be thought of
as those subjects who returned the completed questionnaires. Poor
employed a measure of self-esteem developed by Morris Rosenberg and
found “volunteers,” or responders, to be higher in self-esteem than
“nonvolunteers” (p < .07). A study by Pan (1951) of residents of homes
for the aged is also, at least indirectly, relevant to this discussion.
Pan observed that residents who completed and returned his
questionnaires were in better physical health than were nonrespondents.
Only because it appears that physical and mental health are somewhat
correlated do Pan's results imply that respondents may also be better
adjusted psychologically than nonrespondents. There is, however, good
reason to be cautious about these results, for it is possible that the
superintendents of the homes may have unduly influenced the composition
of the respondent group by distributing the questionnaires primarily to
residents in good health.

So far we have considered nine samples in which volunteers (i.e.,
respondents or interviewees) were compared with nonvolunteers
(nonrespondents or noninterviewees) on adjustment-related variables. In
seven of those nine, volunteers appeared to be the better adjusted. In
one sample, nonvolunteers were the better adjusted, and in one other,
volunteers did not differ from nonvolunteers in adjustment. On the
whole, then, it would seem that in questionnaire or interview studies,
respondents will mainly be those subjects who tend to be
psychologically well-adjusted.

When volunteering is requested for a typical psychological experiment,
the relationship between adjustment and volunteering becomes more
equivocal. Schubert (1964), for example, found no differences in the
MMPI scores on neuroticism between volunteers and nonvolunteers for
a psychological experiment. He did find, however, that volunteers
tended to be more irresponsible than nonvolunteers. In one of Poor's
(1967) studies, subjects were solicited for a psychological experiment.
Using the Rosenberg self-esteem measure, Poor found a tendency toward
lower self-esteem among volunteers than among nonvolunteers (p = .14).
It will be recalled that Poor also found just the opposite result when
solicitation had been of questionnaire returns. Finally, Ora (1966),
using a self-report measure of adjustment, found no difference between
volunteers and nonvolunteers for various psychological experiments.
However, the volunteers perceived themselves in greater need of
psychological assistance than did the nonvolunteers, even without their
feeling themselves more maladjusted. There is little basis for any
conclusion to be drawn from the various findings.

We noted earlier, in discussing “shows” and “no shows,” that not all
subjects who volunteer actually become part of the final data pool.
There are two studies of those subjects who fail to keep research
appointments that appear relevant to the adjustment variable. Silverman
(1964) found that when participation was requested for a psychological
experiment, it was subjects higher in self-esteem (as defined by a
modified Janis and Field measure) who more often failed to keep their
appointment. This finding seems consistent with that of Poor (1967),
who also found subjects lower in self-esteem more likely to end up by
contributing data to the behavioral experimenter. Wrightsman (1966),
however, observed that subjects who failed to keep their research
appointments scored lower in social responsibility, a finding which is
inconsistent with Schubert's (1964) observation that less responsible
subjects are those more likely to volunteer.

The probable complexity of the relationship between psychopathology
and volunteering for a fairly standard, or unspecified, psychological
experiment is well illustrated in the study by Newman (1956). Male
volunteers were found to be less variable in degree of
self-actualization than male nonvolunteers, whereas female volunteers
showed greater variability than female nonvolunteers. If one accepts
self-actualization as a measure of adjustment, the implication is a
U-shaped relationship between adjustment and volunteering for females
when the task is a psychological experiment, but an upside-down U for
males. Whereas the best and the least adjusted males may be less likely
to volunteer than those males who are moderately well adjusted, the
best and the least adjusted females may be more likely to volunteer
than moderately well adjusted females.

When we turn to a consideration of the somewhat less standard types
of behavioral experiments, we find similar difficulties in trying to
summarize the relationship between volunteering and adjustment. Schultz
(1967c) reports that female undergraduate volunteers for an experiment
in sensory deprivation scored higher in emotional stability on the
Cattell than nonvolunteers. However, in an experiment on hypnosis,
Hilgard, Weitzenhoffer, Landes, and Moore (1961) found that female
undergraduate volunteers scored lower in self-control than
nonvolunteers. Among male subjects they report no significant
difference in self-control between volunteers and nonvolunteers. In
their recruitment for volunteers for hypnosis research, Lubin, Brady,
and Levitt (1962a, 1962b) found no significant difference between
student nurse volunteers and nonvolunteers that could be attributed to
differences in adjustment. The tendency was, however, for the
volunteers to appear somewhat less well adjusted than the nonvolunteers
as defined by a variety of test scores and obesity.

When the research for which volunteers are solicited takes on a
medical appearance, the relationship between psychopathology and
volunteering is a little more clear. Pollin and Perlin (1958) and
Perlin, Pollin, and Butler (1958) have concluded that the more
intrinsically eager a person is to volunteer for hospitalization as a
normal control subject, the more likely he is to be maladjusted.
Similarly, Bell (1962) reports that volunteers for studies of the
effects of high temperature were more likely to be maladjusted than
nonvolunteers.

Lasagna and von Felsinger (1954), recruiting subjects for drug re- 
search, noted the high incidence of psychopathology among volunteers. 
A similar finding has also been reported by Esecover, Malitz, and
Wilkens (1961), who solicited volunteers for research on hallucinogens.
They found that the better adjusted volunteers were motivated more
by money, scientific curiosity, or because volunteering was a normally
expected occurrence (e.g., as by medical students). Such findings are
also consistent with the results of Pollin and Perlin (1958).

Thus far the results of studies of volunteers for medical research agree 
rather well with one another, consistently tending to show greater psy- 
chopathology among volunteers than nonvolunteers. To this impression, 
however, one must add the results obtained by Richards (1960), who 
compared volunteers and nonvolunteers for a study of mescaline on 
the basis of their responses to the Rorschach and TAT. Though signifi- 
cant differences were obtained, the nature of those differences was such 
that no conclusion could be drawn as to which group was the more 
maladjusted. It can be noted, however, that Richards’ subjects were un- 
dergraduates in the medical sciences, where volunteering might well 
have been the expected behavior for these students who may also have 
been motivated by a preprofessional interest in drug research. Under 
such circumstances, one might not expect volunteers to reveal any excess 
of psychopathology over what one would obtain among nonvolunteers. 

K. Intelligence 

There are several studies showing a difference in intellectual
performance between volunteers and nonvolunteers. Martin and Marcuse
(1957) found that volunteers for an experiment in hypnosis scored
higher on the ACE than nonvolunteers. In a subsequent study, Martin
and Marcuse (1958) solicited volunteers for three additional
experiments in personality, learning, and attitudes towards sex. For
all four studies combined, volunteers were still found to score higher
in intelligence than nonvolunteers. The definition of intelligence
again was the score on the ACE, a test employed also by Reuss (1943) in
a study of responders and nonresponders to a mail questionnaire. Reuss
also found that “volunteers” (i.e., responders) scored higher in
intelligence than “nonvolunteers.” Myers et al. (1966), however,
employing a U.S. Army technical aptitude measure of intelligence, found
no significant relationship between intelligence and volunteering for
isolation experiments.

In a study of high school juniors, Wicker (1968) compared the degree
of participation in behavioral research of “regular” and “marginal”
students. Regular students were defined as those scoring at least 105
on an IQ test and who either earned no grades below C in the preceding
semester or were children of fathers in professional occupations.
Marginal students were defined as those who scored below 100 in IQ and
either having earned two or more F grades in the preceding semester or
who were children of fathers in “lower” occupational categories. Of the
juniors in the regular group, 44 per cent found their way into the
research project. Of the marginal group, less than 14 per cent made
their way into the project.

In requesting student nurses to volunteer for an hypnotic dream
experiment, Edwards (1968) found no relationship between IQ and
volunteering but, somewhat surprisingly, he found volunteers to score
significantly lower than nonvolunteers on a test of psychiatric
knowledge (p < .01) and that volunteers were also lower in relative
class standing (p = .07). In addition, nonvolunteers' fathers were
better educated than the fathers of volunteers (p = .006). These
findings are opposite in direction to those obtained by Wicker, but it
can be noted that Edwards' subjects were more highly selected from the
upper end of the distribution than were Wicker's. Edwards' findings
also tend to differ from those of Martin and Marcuse, but this
inconsistency cannot be attributed to differences in the general level
of intellectual performance between the two samples.

Brower (1948) found that volunteers for a visual-motor skill
experiment performed better at difficult visual-motor tasks than did
nonvolunteers, though there was no performance difference in a simple
visual-motor task. Wolfgang (1967) solicited volunteers for a learning
experiment, after which all subjects were administered the
Shipley-Hartford test of abstract thinking ability. Male volunteers
showed better performance than male nonvolunteers, but no difference
was revealed between female volunteers and nonvolunteers. Common to the
studies of Brower and Wolfgang is that volunteer status was established
before determining the correlate of volunteering. At least in
principle, it is possible that the act of volunteering or refusing to
volunteer may affect the subject's subsequent task performance. Thus,
a subject who has been coerced to participate in an experiment may be
poorly motivated to perform well at tasks that have been set for him
against his will. It remains problematic whether his poorer performance
antedated his decision not to volunteer. In any experiment in which the
test to be correlated with volunteering is administered both to
volunteers and nonvolunteers once the volunteers have already
participated in an experiment, one must examine carefully the nature of
the experimental task. If the task were similar to the test, one would
expect the volunteers to perform better, since the task would provide
a practice session for their performance on the test.

When the definition of intelligence is in terms of a standard IQ test
or even a test of visual-motor skill, there are at least a half-dozen samples
showing that volunteers perform better than nonvolunteers, three sam-
ples showing no difference in IQ, and no samples showing nonvolunteers
to perform better than volunteers. It appears that when there are differ-
ences in intelligence between volunteers and nonvolunteers, the differ-
ence favors the volunteers.

When the definition of “intelligence” is in terms of school grades,
the relationship of volunteering to intelligence becomes more equivocal.
On the one hand, Edwards (1968) found that volunteers stood lower
in class rankings than nonvolunteers. On the other hand, Wicker (1968),
including grades in his definition of marginality of academic status,
found that volunteers performed better than nonvolunteers. The relation-
ship is further complicated by the fact that Rosen (1951) found no
difference in grades between male volunteers and male nonvolunteers
for behavioral research, but among female subjects the volunteers tended
to earn higher grades (p < .10). Poor (1967) obtained no differences
in grades or in intellectual interests and aspirations between respondents
and nonrespondents to a mail questionnaire nor between volunteers and
nonvolunteers for a psychological experiment. Though Abeles, Iscoe,
and Brown (1954-55) found no overall relationship with volunteering,
they did find a tendency toward higher grades among early volunteers.
From the few studies in which volunteering has been correlated with
school grades it is difficult to draw any clear conclusions. If there is
a relationship between school grades and volunteering it seems to be
neither strong nor consistent.

Leipold and James (1962) found that female volunteers who showed
up for the experiment as scheduled had been earning higher grades
in psychology; among male subjects this relationship was not significant.
Where grades are so specific to a single course they are likely to be
less well correlated with general intelligence. Perhaps grades in a psy-
chology course are as much a measure of interest in research as they
are a measure of intelligence. Such an interpretation would be consistent
with the findings of Edgerton, Britt, and Norman (1947). They reported
that over a period of several years winners of science talent contests
responded to a mail questionnaire more than did runners-up who, in
turn, responded more than “also rans.” While winners of such contests
may well be more intelligent than losers, the greater interest of the
winners might be the more potent determinant of their cooperation.

In a recent study by Matthysse (1966) a number of volunteers for
an experiment in attitude change were followed up by mail question-
naire. Somewhat surprisingly, but consistent with Edwards’ (1968) find-
ings, those who responded to the follow-up had scored lower in intellec-
tual efficiency (p < .10) on the California Psychological Inventory.

Relevant to intellectual motivation, if not to intellectual performance,
is a consideration of the variable of achievement motivation. In his im-
portant review of volunteer characteristics, Bell (1962) suggests that
volunteers may be higher in need for achievement than nonvolunteers.
The evidence for this hypothesis, however, is quite indirect. More direct
data have become available from research by Lubin, Levitt, and Zucker-
man (1962) and by Myers et al. (1966). In their studies of respondents
and nonrespondents to a mail questionnaire, and of volunteers and non-
volunteers for an isolation experiment, respectively, a tendency, not sta-
tistically significant, was found for respondents and volunteers to score
higher in need achievement on the Edwards Personal Preference Sched-
ule than nonrespondents and nonvolunteers.

L. Education

In most of the studies described in the preceding section the subjects
were students, usually in college, and the educational variance was low,
a finding which is characteristic of experimental studies but not of survey
research. In questionnaire or interview studies the target population
is often intended to show considerable variability of educational back-
ground. Among survey researchers there has long been a suspicion that
better educated people are those more apt to find their way into the
final sample. The suspicion is well justified by the data. Study after
study has shown that it is the better educated person who finds his
responses constituting the final data pool. Since we could find no signifi-
cant reversals to this relationship, we have simply listed in Table III
the various studies in support of this conclusion.


                              TABLE III

  Studies Showing Respondents in Survey Research to be Better Educated

  Authors                              Date

  Benson, Booman, and Clark            1951
  Franzen and Lazarsfeld               1945
  Gaudet and Wilson                    1940
  Pace                                 1939
  Pan                                  1951
  Reuss                                1943
  Robins                               1963
  Suchman and McCandless               1940
  Wallin                               1949
  Zimmer                               1956

M. Social Class 

There is a high degree of correlation between amount of education
and social class, the latter defined by occupational status. One would
expect, therefore, that in survey research those subjects having higher
occupational status roles would be more likely to answer questions and
to answer them sooner than subjects lower in occupational status. Belson
(1960), Franzen and Lazarsfeld (1945), and Robins (1963) have all
found professional workers more likely than lower class jobholders to
participate in survey research. Similarly, Pace (1939) found profes-
sionals more willing to be interviewed and to return their questionnaires
more promptly than nonprofessionals. Zimmer (1956), in his study of
Air Force officers and enlisted men, found that probability of responding
to a questionnaire increased directly as the serviceman’s rank increased.
Finally, King (1967) found in his survey of Episcopal clergymen that
questionnaires were more likely to be returned by (a) bishops than
by rectors, (b) rectors than by curates, and (c) curates than by vestry-
men. The sharp and statistically significant break came between the
rectors and curates. However, there is some possibility in King’s study
that not all of the curates and vestrymen actually received the
questionnaires.

Even discounting King’s results, the trend is clear. At least for the
range of occupational statuses noted here, higher status role occupants
are more likely to participate in the survey research process than those
lower in status.

We have been talking of the volunteer’s own social class as defined
by his occupational status. The picture becomes more complicated when
we consider the social class of the volunteer’s parents, his class of origin.
Edwards (1968) reports that the fathers of volunteers for an hypnotic
dream experiment had a lower educational level than the fathers of
nonvolunteers. Similarly, Reuss (1943) found that the parents of respon-
dents to a mail questionnaire had less education than the parents of
nonrespondents. Rosen (1951) notes that the fathers of female volunteers
for psychological research had a lower income than the fathers of non-
volunteers. Poor (1967), however, found no relationship between father’s
occupational status and either responding to a questionnaire or volun-
teering for an experiment. The trend, if any, was for the fathers of
respondents to have a higher occupational status than the fathers of
nonrespondents. Finally, the reader will recall the study by Wicker
(1968), in which marginal students participated in research less often
than nonmarginal students. For Wicker, father’s occupation was part
of the definition of marginality. If father’s occupational status, then,
made any difference at all, it was the children of higher status fathers
who more often produced data for the behavioral researcher. To sum-
marize, when father’s social class makes the clearest difference it seems
that children of lower class fathers are the most likely to volunteer.
Since the evidence seems so clear that subjects who are themselves
lower class members are less likely to volunteer, the hypothesis is sug-
gested that those higher status persons are most likely to volunteer whose
background includes vertical social mobility. In keeping with a point
made earlier in this chapter, these latter persons may be those who,
at least in survey research, would perceive themselves as having the
most interesting and most acceptable answers to the investigator’s
questions.

N. Age

There are over a dozen studies addressed to the question of age differ-
ences between volunteers and nonvolunteers. As we already have seen
with other variables, often there is no significant difference between
the ages of volunteers and nonvolunteers. However, when differences
are found they most often suggest that younger rather than older subjects
are those who volunteer for behavioral research. Abeles, Iscoe, and
Brown (1954-55) found this to be the case in their questionnaire study.
Newman (1956) observed the same relationship for personality and
perception experiments, although the age difference in the latter experi-
ment was not statistically significant. In another experiment in percep-
tion, however, Marmer (1967) found volunteers to be significantly
younger than nonvolunteers. Rosen (1951) found that female volunteers
were significantly younger than female nonvolunteers, a difference
which, however, did not hold for male subjects. For research with college
students, then, even when the general trend is for volunteers to be
younger, the sex of the subjects and the type of experiment for which
volunteering is requested seem to complicate the relationship between
age and volunteering.

From studies not employing the usual college samples there is
some evidence for the greater youthfulness of volunteers. Myers,
Murphy, Smith, and Goffard (1966) requested Army personnel to volun-
teer for a study of perceptual isolation and found volunteers to be
younger. Pan (1951), too, found his respondents [...]
homes for the aged to be younger than the nonrespondents. A similar
tendency was reported by Wallin (1949).

Nevertheless, opposite results also have been reported. Thus, in King’s
(1967) study it was the older clergymen who were most likely to reply
to a brief questionnaire. In that study, however, age was very much
confounded with position in the Church’s status hierarchy. That situation
seems to hold for the study by Zimmer (1956) as well. He found older
Air Force men to respond more readily to a mail questionnaire, but
the older airmen were also those of higher rank. Kruglov and Davidson
(1953), in their study of male undergraduates, found the older students
more willing to be interviewed than the younger students. Even allowing
for the confounding effects of status in the studies by King and by
Zimmer, the three studies just described weaken considerably the hy-
pothesis that volunteers tend to be younger than nonvolunteers. That
hypothesis is weakened further by several studies showing no differences
in age between volunteers and nonvolunteers (Benson, Booman, and
Clark, 1951; Edwards, 1968; Poor, 1967).

Further evidence for the potentially complicated nature of the relation-
ship between age and volunteering comes from the work of Gaudet
and Wilson (1940). They found that their determined nonvolunteers
for a personal interview tended to be of intermediate ages, with the
younger and older householders more willing to participate. Similarly,
curvilinearities have been reported by Newman (1956), though for his
collegiate sample the curvilinearity was opposite in direction to that
found by Gaudet and Wilson. Especially among Newman’s female sub-
jects, the nonvolunteers showed more extreme ages than did the volun-
teers. The same tendency was found among male subjects, but it was
significant for males only among subjects recruited for a personality
experiment, not among those recruited for a perception experiment.

O. Religion 

The data are sparse that bear on the question of the relationship
between volunteering and religious affiliation and attitudes. Matthysse
(1966) found his respondents to a mail questionnaire to be dispropor-
tionately more often Jewish than Protestant and also to be more con-
cerned with theological issues. The latter finding is not surprising since
the questionnaire dealt with religious attitudes. Rosen (1951) also found
Jews to be significantly overrepresented in his sample of volunteers for
psychological research. In addition, Rosen found volunteers to be less
likely to attend church services than nonvolunteers. However, Ora
(1966) found no relationship between religious preference or church
attendance and volunteering for various psychological experiments.

In his interview research, Wallin (1949) found Protestants somewhat
more likely than Catholics to participate in a study of the prediction
of marital success. The tenuousness of these findings is well illustrated
in a study by Poor (1967). He, too, found no significant relationship
between volunteering and either religious affiliation or church atten-
dance. However, in his study of respondents to a mail questionnaire
he reports a trend towards greater participation among Protestants than
among Catholics or Jews, while in his study of volunteers for a psycho-
logical experiment, he reports a trend towards greatest participation
among Catholics and least participation among Jews. In a study of stu-
dent nurses, Edwards (1968) found no association between volunteering
and religious attitudes.

In summary, then, it is not possible to make any general statement
concerning the relationship between volunteering and either religious
affiliation, religious attitudes, or church attendance. On the basis of our
earlier discussion, one might speculate that if any relationship does exist
it is probably complicated by subjects’ sex and the type of research
for which participation is solicited.


P. Geographic Variables

In his study of respondents to a mail questionnaire, Reuss (1943)
found greater participation by subjects from a rural rather than an urban
background. In his study of college students, however, Rosen (1951)
found no such difference between volunteers and nonvolunteers. Siegman
(1956) reports that for a Kinsey-type interview, volunteering rates were
higher in an Eastern than in a Midwestern university. Presumably, there
were more students of rural origin in the Midwestern sample. Perhaps
the nature of the volunteer request interacts with rural-urban origin
to determine volunteering rates. Finally, Franzen and Lazarsfeld (1945)
found that respondents to their mail questionnaire were overrepresented
by residents of the East Central States but underrepresented by residents
of New England and the Middle Atlantic States. In addition, residents
of cities with a population of less than 100,000 were overrepresented
relative to residents of cities of larger population. In view of the
[...]ness of the obtained data it seems best to forego any summary of the
relationship between geographic variables and volunteering for be-
havioral research.


III. POPULATIONS INVESTIGATED

Before summarizing what is known and not known about differentiat-
ing characteristics of volunteers for behavioral research, let us consider
the populations that have been discussed. All of the studies cited
sampled from populations of human subjects, situations, tasks, contexts,
personal characteristics, and various measures of those characteristics
(Brunswik, 1956).

A. Human Subjects 

In his study of subject samples drawn for psychological research,
Smart (1966) examined every article in the Journal of Abnormal and
Social Psychology appearing in the years 1962-1964. Less than 1 per
cent of the studies employed samples from the general population, 73
per cent used college students, and 32 per cent used introductory psy-
chology students. Comparable data from the Journal of Experimental
Psychology revealed no studies that had sampled from the general popu-
lation; 86 per cent of the studies employed college students, and 42
per cent used students enrolled in introductory psychology courses.
These data provide strong, current support for McNemar’s 20-year-old
criticism of behavioral science’s being largely a science of the behavior
of sophomores.

The studies discussed in this chapter provide additional support for
McNemar’s contention. The vast majority of the psychological experi-
ments drew their subject samples from college populations. When the
studies were in the nature of surveys, a much broader cross section
of subject populations was tapped, but even then college students were
heavily represented. Such a great reliance on college populations may
be undesirable from the standpoint of the representativeness of design
in behavioral research generally, but it does not reflect an ecological
invalidity for our present purpose. Sampling of subject populations in
studies of volunteer characteristics seems to be representative of sam-
pling of subject populations in behavioral research generally.

B. Situations, Tasks, and Contexts

A considerable variety of situations, tasks, and contexts were sampled
by the studies discussed here. The tasks for which volunteering was re-
quested included survey questionnaires, Kinsey-type interviews, psycho-
pharmacological and medical control studies, and various psychological
experiments focusing upon small group interaction, sensory deprivation,
hypnosis, personality, perception, learning, and motor skills. Unfor-
tunately, very few studies have employed more than one task; hence,
little is known about the effects of the specific task either on the rate
of volunteering to undertake it or on the nature of the relationship
between volunteering and the personal characteristics of volunteers.

There is, however, some information bearing on this problem. Newman
(1956), for example, employed more than one task, asking subjects to
volunteer both for a personality and a perception experiment, but he [...]

[...] in the opposite direction. For organizational and heuristic purposes, how-
ever, we have grouped the many findings together under a fairly small
number of headings. Decisions to group any variables under a given
heading were made on the basis of empirically established or concep-
tually meaningful relationships.

It should further be noted that within any category several different
operational definitions may have been employed. Thus, we have dis-
cussed anxiety as defined by the Taylor Manifest Anxiety Scale as well
as by the Pt scale of the MMPI. This practice was made necessary
by the limited number of available studies employing identical opera-
tional definitions, except possibly for age and sex. This necessity, how-
ever, is not unmixed with virtue. If, in spite of differences of operational
definition, the variables serve to predict the act of volunteering, we
can feel greater confidence in the construct underlying the varying defini-
tions and in its relevance to the predictive and conceptual task at hand.


IV. SUMMARY OF VOLUNTEER CHARACTERISTICS 

For this mass of studies some attempt at summary is essential. Each
of the characteristics sometimes associated with volunteering has been
placed into one of three groups of statements. In the first group, state-
ments or hypotheses are listed for which the evidence seems strongest.
Though in absolute terms our confidence may not be so great, in relative
terms we have most confidence in the propositions listed in this group.
In the second group, statements or hypotheses are listed for which the
evidence, though not unequivocal, seems clearly to lean in favor of
the proposition. At least some confidence in these statements seems war-
ranted. In the third group, statements or hypotheses are listed for which
the evidence is unconvincing. Little confidence seems warranted in these
propositions. Within each of the three groups of statements, the hypothe-
sized relationships are listed in roughly descending order of warranted
confidence.

A. Statements Warranting Most Confidence 

1. Volunteers tend to be better educated than nonvolunteers.

2. Volunteers tend to have higher occupational status than nonvolun-
teers (though volunteers may more often come from a lower status
background).

3. Volunteers tend to be higher in the need for approval than non-
volunteers (though the relationship may be curvilinear with least
volunteering likely among those with average levels of need
for approval).

4. Volunteers, especially males, tend to score higher than nonvolun-
teers on tests of intelligence (though school grades are not
clearly related to volunteering).

5. Volunteers tend to be less authoritarian than nonvolunteers, espe-
cially when asked to answer personal questions.

6. Volunteers tend to be better adjusted than nonvolunteers when
asked to answer personal questions, but more maladjusted when
asked to participate in medical research. (In psychological experi-
ments the relationship is equivocal.)

B. Statements Warranting Some Confidence

7. Volunteers tend to be more sociable than nonvolunteers.

8. Volunteers tend to be more arousal-seeking than nonvolunteers.

9. Volunteers tend to be more unconventional than nonvolunteers.

10. Volunteers tend more often than nonvolunteers to be firstborn.

11. Volunteers tend to be younger than nonvolunteers, especially when
occupational status is partialled out.

12. Volunteers tend more often than nonvolunteers to be females when
the task is standard and males when the task is unusual.

C. Statements Warranting Little Confidence 

13. Volunteers tend to be more anxious than nonvolunteers when the
task is standard and less anxious when the task is unusual.

14. Male volunteers are less conforming than male nonvolunteers.

15. Volunteers tend more often than nonvolunteers to [...]

16. Volunteers more than nonvolunteers tend to be of rural origin
when the task is standard and of urban origin when the task
is unusual.

It is obvious that the hypothesized relationships require further investi-
gation, especially those falling lower in the lists. There is little reason
for thinking that an experimentum crucis would place any of these rela-
tionships on firmer footing. Many have been [...]
even a score of studies. When the summary of the findings is
equivocal, it is readily apparent that it will take more [...]


V. IMPLICATIONS FOR REPRESENTATIVENESS

The results of our analysis suggest that in any given study of human
behavior the chances are good that those subjects who find their way
into the research will differ appreciably from those subjects who do
not. Even if the direction of difference is not highly predictable, it is
important to know that volunteers for behavioral research are likely
to differ from nonvolunteers in a variety of characteristics. One implica-
tion of this conclusion is that limitations may be imposed on the gen-
erality of findings of research employing volunteer subjects. It is well
known that the violation of the requirement of random sampling compli-
cates the process of statistical inference. This problem is discussed in
basic texts on sampling theory and has also been dealt with by some
of the workers previously cited (e.g., Cochran, Mosteller, and Tukey,
1953).

Granted that volunteers are never a random sample of the population
from which they were recruited, and granting further that a given sample
of volunteers differs on a number of important dimensions from a sample
of nonvolunteers, we still do not know whether volunteer status is a
condition that actually makes any great difference with regard to our
dependent variables. It is possible that in a given experiment the per-
formance of the volunteer subjects would not differ at all from the
performance of the unsampled nonvolunteers if the latter had actually
been recruited for the experiment (Lasagna and von Felsinger, 1954).
The point is that substantively we have little idea of the effect of using
volunteer subjects. What is needed are series of investigations covering
a variety of tasks and situations for which volunteers are solicited, but
for which both volunteers and nonvolunteers are actually used. Thus,
we could determine in what types of studies the use of volunteers
actually makes a difference, as well as the kinds of differences and their
magnitude. When more information is available, we can, with better
conscience, enjoy the convenience of using volunteer subjects. In the
meantime, the best one can do is to hypothesize what the effects of
volunteer characteristics might be in any given line of inquiry.

Let us take, as an example of this procedure, the much analyzed
Kinsey-type study of sexual behavior. We have already seen how volun-
teers for this type of study tend to have unconventional attitudes about
sexuality and may in addition behave in sexually unconventional ways.
This tendency, as has frequently been noted, may have had grave effects
on the outcome of Kinsey-type research, possibly leading to population
estimates of sexual behavior seriously biased in the unconventional direc-
tion. The extent of this type of bias could probably be partially assessed
over a population of college students among whom the nonvolunteers
could be converted into “volunteers” in order to estimate the effect on
data outcome of initial volunteering versus nonvolunteering. Clearly,
such a study would be less feasible among a population of householders
who stood to gain no course credit or instructor’s approval from chang-
ing their status of nonvolunteer to volunteer.

The experiment by Hood and Back (1967) has special implications
for small groups research. At least among male subjects these investigators
found volunteers to be more willing than nonvolunteers to disclose per-
sonal information to others. Among female subjects the relationship be-
tween volunteering and self-disclosure was complicated by the nature
of the experiment for which volunteering had been requested. Especially
among males, then, small groups experiments that depend upon volun-
teer subjects may give inflated estimates of group members’ willingness
to participate openly in group interaction.

In a more standard realm of experimental psychology, Greene (1937)
showed that precision in discrimination tasks was related to subjects’
intelligence and type of personal adjustment. To the extent that volun-
teers differ from nonvolunteers in adjustment and intelligence, typical
performance levels in discrimination tasks may be misjudged when vol-
unteer samples are employed. It seems reasonable to wonder, too, about
the effect of the volunteer variable on the normative data required for
the standardization of an intelligence test. Since, at least in the stan-
dardization of intelligence tests for adults, volunteers tend to be over-
represented, and since volunteers tend to score higher on tests of intelli-
gence, the “mean” IQ of 100 may represent rather an inflation of the
true mean that would be obtained from a more truly random sample.
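
The arithmetic behind this concern can be sketched briefly. The example
below is a minimal illustration under assumptions of our own: the proportions
of volunteers, the size of the volunteer advantage, and the function name
`true_population_mean` are all hypothetical and are not taken from any
standardization study. It assumes that the test is scaled so that the
standardization sample averages 100 and that volunteers exceed nonvolunteers
by a fixed number of points.

```python
def true_population_mean(norm_mean, advantage, share_in_sample, share_in_population):
    """Mean the population would obtain on a scale normed so that the
    standardization sample averages `norm_mean`.

    advantage           -- assumed volunteer-minus-nonvolunteer difference
    share_in_sample     -- assumed proportion of volunteers in the norm sample
    share_in_population -- assumed proportion of volunteers in the population
    """
    # Relative to the nonvolunteer mean, each group's mean is 0 or `advantage`,
    # so the sample and population means differ only through the mixing weights.
    overrepresentation = share_in_sample - share_in_population
    return norm_mean - advantage * overrepresentation


# Hypothetical values chosen only for illustration.
estimate = true_population_mean(
    norm_mean=100.0, advantage=5.0,
    share_in_sample=0.8, share_in_population=0.4,
)
print(round(estimate, 2))
```

With these assumed values the nominal mean of 100 would overstate the
population mean by two points; with no overrepresentation of volunteers the
two means would coincide.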

It was suggested earlier that it might be useful to assess the magnitude
of volunteer bias by converting nonvolunteers to volunteers. One
problem with increasing the pressure to volunteer in a sample of
nonvolunteers is that the experience of having been coerced may affect
the subjects’ responses to the experimental task. That seems especially
likely in situations where nonvolunteers are initially led to believe that
they are free not to volunteer. One partial solution to this problem might
be to recruit volunteers from among nonvolunteers using increasingly
positive incentives, a technique that has met with some success in survey
research. Even then, however, we must try to assess the effect on the
subject’s response of having been sent two letters rather than one, or
of having been offered $2 rather than $1 as spurs to volunteering.

If volunteers differ from nonvolunteers in their response to the task
set by the investigator, the employment of volunteer samples can have
serious effects on estimates of such parameters as means, medians, pro-
portions, variances, skewness, and kurtosis. In survey research, where
the estimation of such parameters is the principal goal, biasing effects
of volunteer samples could be disastrous. In most behavioral experi-
ments, however, interest is not centered so much on such statistics as
means and proportions but rather on such statistics as the differences
between means or proportions. The investigator is ordinarily interested
in relating such differences to the operation of his independent variable.
The fact that volunteers differ from nonvolunteers in their scores on the
dependent variable may be quite irrelevant to the behavioral experi-
menter. He may want more to know whether the magnitude and statisti-
cal significance of the difference between his experimental and control
group means would be affected if he used volunteers. In other words,
he may be interested in knowing whether volunteer status interacts with
his experimental variable.
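
The distinction between a biased mean and a biased difference between
means can be illustrated with a small arithmetic sketch. All of the cell means
below, the assumed share of volunteers in the population, and the function
names are hypothetical, chosen by us purely for illustration and not drawn
from any study cited in this chapter. In the additive case volunteers simply
score higher in both conditions; in the interactive case the treatment effect
itself depends on volunteer status.

```python
# Hypothetical cell means (illustrative values only, not from any cited study).
# Keys: (volunteer status, condition) -> mean score on the dependent variable.

def treatment_effect(cells, status):
    """Experimental-minus-control difference in means for one group."""
    return cells[(status, "experimental")] - cells[(status, "control")]


def pooled_mean(cells, condition, p_volunteer):
    """Population mean for a condition, weighting each group by its share."""
    return (p_volunteer * cells[("volunteer", condition)]
            + (1 - p_volunteer) * cells[("nonvolunteer", condition)])


# Additive case: volunteers score 5 points higher in both conditions.
additive = {
    ("volunteer", "control"): 55, ("volunteer", "experimental"): 65,
    ("nonvolunteer", "control"): 50, ("nonvolunteer", "experimental"): 60,
}

# Interactive case: the treatment helps volunteers more than nonvolunteers.
interactive = {
    ("volunteer", "control"): 55, ("volunteer", "experimental"): 70,
    ("nonvolunteer", "control"): 50, ("nonvolunteer", "experimental"): 52,
}

P_VOLUNTEER = 0.25  # assumed share of volunteers in the population

for label, cells in (("additive", additive), ("interactive", interactive)):
    vol = treatment_effect(cells, "volunteer")
    non = treatment_effect(cells, "nonvolunteer")
    pop = (pooled_mean(cells, "experimental", P_VOLUNTEER)
           - pooled_mean(cells, "control", P_VOLUNTEER))
    print(f"{label}: volunteers {vol}, nonvolunteers {non}, population {pop}")
```

Under these assumed numbers the additive case yields the same treatment
effect of 10 points for volunteers, nonvolunteers, and the population, even
though the volunteer means themselves are too high. In the interactive case
an all-volunteer sample would show an effect of 15 points where the population
effect is 5.25.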


VI. IMPLICATIONS FOR EXPERIMENTAL OUTCOMES 

In this section we shall describe the evidence relevant to the problem
of interaction of volunteer status with various experimental variables.
Compared to the evidence amassed to show inherent differences between
volunteers and nonvolunteers, there is little evidence available from
which to decide whether volunteer status is likely to interact with
experimental variables. On logical grounds alone one might expect such
interactions. If we assume for the moment that volunteers are more
often firstborn than are nonvolunteers, some research by Dittes (1961)
becomes highly relevant. Dittes found that lessened acceptance by peers
affected the behavior of firstborns but not that of laterborns. Still assum-
ing firstborns to be overrepresented by volunteers, a study of the experi-
mental variable of “lessened acceptance” conducted on volunteers might
show strong effects, while the same study conducted with a more nearly
random sample of subjects might show only weak effects.

We can imagine, too, an experiment to test the effects of some
experimental manipulation on the dependent variable of gregariousness. If a
sample of highly sociable volunteers were drawn, any manipulation
designed to increase gregariousness might be too harshly judged as
ineffective simply because the untreated control group would already be
unusually high on this factor. The same manipulation might prove effective
in increasing the gregariousness of the experimental group relative to
the gregariousness of the control group if the total subject sample
were characterized by a less restricted range of sociability. At least
in principle, then, the use of volunteer subjects could lead to an increase
in Type II errors.

The opposite type of error can also be imagined. Suppose an investigator
were interested in the relationship between the psychological
adjustment of women and some dependent variable. If female volunteers
are indeed more variable than female nonvolunteers on the dimension
of adjustment, and if there were some relationship between adjustment
and the dependent variable, then the magnitude of that relationship
would be overestimated when calculated for a sample of volunteers
relative to a sample of nonvolunteers. So far in our discussion we have
dealt only with speculations about the possible effects of volunteer
status on experimental or correlational outcomes. Fortunately, we are
not restricted to speculation, since there are several recent studies
addressed to this problem.
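
The overestimation argument above can be made concrete with a small worked
calculation. The sketch below assumes, purely for illustration, a linear
relation y = slope * x + noise with noise independent of x. The slope, the
noise level, and the two standard deviations of adjustment are hypothetical
values, not estimates from any study cited in this chapter.

```python
import math

def correlation_from_spread(slope, sd_x, sd_noise):
    """Population correlation between x and y when y = slope * x + noise.

    Assumes noise is independent of x. All arguments are hypothetical
    illustration values, not data from the studies discussed.
    """
    signal = slope * sd_x
    return signal / math.sqrt(signal ** 2 + sd_noise ** 2)

# Same underlying relationship, but the (hypothetical) volunteer sample
# is twice as variable on adjustment as the nonvolunteer sample.
r_nonvolunteers = correlation_from_spread(slope=0.5, sd_x=1.0, sd_noise=1.0)
r_volunteers = correlation_from_spread(slope=0.5, sd_x=2.0, sd_noise=1.0)

print(round(r_nonvolunteers, 3), round(r_volunteers, 3))
```

Under these assumptions the slope is identical in both samples, yet the
correlation is larger in the more variable sample, which is the sense in
which the relationship would be overestimated among volunteers.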

A. The Hayes, Meltzer, and Lundberg Study (1968)

In this experiment the investigators were interested in studying the
effects on the subject's vocal participation in dyadic task-oriented
discussions of (a) his possession of task-relevant information, (b) his
co-discussant's possession of task-relevant information, and (c) the joint
possession of task-relevant information. The dyad's task was to instruct
a builder how to build a complex tinkertoy structure. The builder had no
diagram, but each of the two instructors did. Amount of task-relevant
information was varied by the use of good, average, and poor diagrams. One
third of the 120 undergraduates were assigned to each of the three
levels of information, and within each of these three levels, co-discussants
were given either good, average, or poor information. Within
each of the nine conditions so generated, half the subjects were paid
volunteers and half were required to serve. Results showed that there
were no effects on vocal activity of a subject's own level of information,
but that subjects talked least when their partner had most information.
In addition, paid volunteers participated significantly more than did
the conscripted subjects, a finding also cited earlier in this chapter.

The results in which we are most interested are the interactions between
the volunteering variable and the experimental manipulations.
None of those Fs approached significance; indeed, all were less than
unity. From this experiment one might conclude that, although volunteers
and conscriptees differ in important ways from one another, they are
nevertheless similarly affected by the operation of the experimental
manipulation. No serious errors of inference would have occurred had the
investigators used a sample composed entirely of volunteers.

Perhaps we should be surprised to discover any difference between
the volunteers and the conscriptees. Conscriptees, after all, are not
nonvolunteers, but rather a mixed group, some of whom would have
volunteered had they been invited and some of whom (the bona fide
nonvolunteers) would have refused. A comparison between a group of
volunteers and a group comprised of volunteers plus nonvolunteers (in
unknown proportion) should not yield a difference so large as a
comparison between volunteers and nonvolunteers. Perhaps, then, when
volunteers and nonvolunteers are more clearly differentiated there may be
significant interactions between the experimental variable and the
volunteer variable.
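
The dilution argument in this paragraph reduces to simple arithmetic. The
sketch below assumes that the conscripted group's mean is a weighted average
of a volunteer mean and a nonvolunteer mean. The function name, the means,
and the proportions are hypothetical, since the true proportion of would-be
volunteers among conscriptees is unknown.

```python
def expected_difference(mean_vol, mean_nonvol, prop_would_volunteer):
    """Expected mean difference between volunteers and a conscripted group.

    The conscripted group is modeled as a mixture: a proportion
    `prop_would_volunteer` of its members resemble volunteers and the
    rest resemble bona fide nonvolunteers. All inputs are hypothetical.
    """
    if not 0.0 <= prop_would_volunteer <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    mean_conscripted = (prop_would_volunteer * mean_vol
                        + (1.0 - prop_would_volunteer) * mean_nonvol)
    return mean_vol - mean_conscripted

# Hypothetical means of 10 and 6 on some dependent variable.
pure_comparison = expected_difference(10.0, 6.0, 0.0)     # no would-be volunteers
diluted_comparison = expected_difference(10.0, 6.0, 0.5)  # half would have volunteered

print(pure_comparison, diluted_comparison)
```

In this model the observed difference shrinks in direct proportion to the
share of would-be volunteers among the conscriptees, so a volunteer versus
conscriptee comparison gives a lower bound on the volunteer versus
nonvolunteer difference.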

Before leaving the Hayes, Meltzer, and Lundberg study, one additional
possibility can be noted. Since the conscriptees and the paid volunteers
were contacted at different times of the school year, it is possible that
the differences between the two groups were confounded by temporal
academic variables, e.g., time to next examination period, as well as
by differences in the subject pools available at the two periods of the
year.

B. The Rosnow and Rosenthal Study (1966)

The primary purpose of this experiment was to examine the differential
effects of persuasive communications on volunteer and nonvolunteer
samples. Approximately half of the 42 female undergraduates had volunteered
for a fictitious experiment in perception, and half had not volunteered.
Both the volunteer and nonvolunteer subjects then were assigned at
random to one of three groups. One group of subjects was exposed to
a pro-fraternity communication, a second group was exposed to an
anti-fraternity communication, and a third group received neither
communication. For all subjects, prior opinions about fraternities had been
unobtrusively measured one week earlier by means of fraternity opinion items
embedded in a 16-item opinion survey. After exposure to pro-, anti-,
or no communication about fraternities, subjects were retested for their
opinions.

Table IV shows the mean opinion change scores for each experimental
condition separately for volunteers and nonvolunteers. The associated
probability levels are based on t tests for correlated means, and they
suggest that while opinion changes were not dramatic in their p values,
they were large in magnitude and greater than might be ascribed to
chance. In a set of six ps, only one would be expected to reach the
.17 level by chance alone. In this study, three of the six ps reached
that level despite the average of only 6 df per group. The only group
to show opinion change significant at the .05 level was that comprised
of volunteers exposed to the anti-fraternity communication.
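
The chance expectation quoted above can be checked with a short binomial
calculation. The sketch below treats the six tests as independent and all
six null hypotheses as true, which is an assumption added here for
illustration and not a claim made in the text. The helper name
`prob_at_least` is hypothetical.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

n_tests = 6
level = 0.17

# Expected number of the six ps reaching the .17 level by chance alone.
expected_hits = n_tests * level

# Probability that three or more of the six reach that level by chance.
tail = prob_at_least(3, n_tests, level)

print(round(expected_hits, 2), round(tail, 4))
```

Under these assumptions about one p in six is expected to reach the .17
level, and three or more doing so would occur by chance roughly 7 times in
100.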
Table V shows an alternative analysis in which the experimental
groups' opinion changes are compared with one another separately for
volunteers and nonvolunteers. The overall effect of the pro- versus
anti-fraternity communications was not significantly greater among volunteers
than among nonvolunteers. The magnitude of the effect, however,
reached a p of .004 among volunteers compared to a p of .07 among
nonvolunteers. An investigator employing a strict decision model of
inference and adopting an alpha level of .05 or .01 would have reached
different conclusions had his experiment been conducted with volunteers
rather than nonvolunteers.


TABLE IV

Opinion Change Among Volunteers and Nonvolunteers

                        Volunteers                  Nonvolunteers
Treatment           Change      Two-tail p <    Change       Two-tail p <

Pro-fraternity      +1.67 (9)       .20         +2.50 (6)        .15
Control             +0.40 (5)       .90         -0.91 (11)       .15
Anti-fraternity     -3.50 (6)       .05         -1.20 (5)        .50

Note. A positive valence indicates that opinions changed in a pro-fraternity
direction; a negative valence, in an anti-fraternity direction. Numbers in
parentheses indicate the sample size.


TABLE V

Effectiveness of One-Sided Communications
Among Volunteers and Nonvolunteers

Treatment difference      Volunteers      Nonvolunteers

Pro minus control           +1.27             +3.41
Control minus anti          +3.90 b           +0.29
Pro minus anti              +5.17 a           +3.70 c

a p = .004
b p = .05
c p = .07

Table V also shows that for volunteers the anti-fraternity communication
was more effective, while for nonvolunteers the pro-fraternity
communication was more effective (interaction p < .05). It appears, then,
that volunteer status can, at times, interact with experimental
manipulations to affect experimental outcomes.
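
The entries of Table V follow by subtraction from the mean change scores in
Table IV, and the direction of the interaction can be read from a simple
contrast. The sketch below is illustrative only: the dictionary layout and
function names are hypothetical, and the contrast is purely descriptive,
since the reported p < .05 depends on within-group variability that is not
given in the text.

```python
# Mean opinion change scores as reported in Table IV.
table_iv = {
    "volunteers": {"pro": 1.67, "control": 0.40, "anti": -3.50},
    "nonvolunteers": {"pro": 2.50, "control": -0.91, "anti": -1.20},
}

def treatment_differences(means):
    """Differences between condition means, as tabulated in Table V."""
    return {
        "pro_minus_control": means["pro"] - means["control"],
        "control_minus_anti": means["control"] - means["anti"],
        "pro_minus_anti": means["pro"] - means["anti"],
    }

table_v = {group: treatment_differences(means)
           for group, means in table_iv.items()}

def anti_advantage(diffs):
    """How much more effective the anti- than the pro-communication was."""
    return diffs["control_minus_anti"] - diffs["pro_minus_control"]

# Descriptive interaction contrast: positive when the anti-communication
# was relatively more effective among volunteers than among nonvolunteers.
interaction_contrast = (anti_advantage(table_v["volunteers"])
                        - anti_advantage(table_v["nonvolunteers"]))

for group, diffs in table_v.items():
    print(group, {name: round(value, 2) for name, value in diffs.items()})
print("interaction contrast:", round(interaction_contrast, 2))
```

The contrast is positive for volunteers and negative for nonvolunteers,
which matches the verbal description of the interaction.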
We can only speculate on why the particular interaction occurred.
Some evidence is available to suggest that faculty experimenters were
seen as being moderately anti-fraternity. Perhaps volunteers, who tend
to show a greater need for approval, felt they would please the
experimenter more by being more responsive to his anti-fraternity
communication than to his pro-fraternity communication. That does not explain,
however, why nonvolunteers tended to show the opposite effect unless
we assume that they also saw the experimenter as being more
anti-fraternity and resisted giving in to what they saw as his unwarranted
influence attempts.

Within each of the three experimental conditions the pretest-posttest
reliabilities were computed separately for volunteers and nonvolunteers.
The mean reliability (rho) of the volunteer subjects was .35, significantly
lower than the mean reliability of the nonvolunteers (.97) at p < .0005.
Volunteers, then, were more heterogeneous in their opinion change
behavior, perhaps reflecting their greater willingness to be influenced in
the direction they felt was demanded by the situation (see the chapter
by Orne). The findings of this study as well as our review of volunteer
characteristics suggest that volunteers may more often than
nonvolunteers be motivated to confirm what they perceive to be the
experimenter's hypothesis.

C. The Rosnow and Rosenthal Study (1967) 

This experiment will be described in greater detail than the other
studies summarized because it has not previously been published.* As
in our earlier study (Rosnow and Rosenthal, 1966), the primary purpose
was to examine the differential effects of communications on volunteer
and nonvolunteer samples. In this study, however, two-sided as well
as one-sided communications were employed.

Four introductory sociology classes at Boston University provided the
103 male and 160 female subjects. All of the students were invited by
their instructors to volunteer for either or both of two fictitious
psychological experiments; one of the experiments purported to deal with
psychoacoustics, the other with social groups. Approximately one week
later, all of the subjects in each class simultaneously were presented
by the experimenter with one of five different booklets, representing
the five treatments in this after-only design. On the cover page of every
booklet were four items, which inquired as to the subject's (a) sex,
(b) cigarette smoking habit, (c) coffee drinking habit, and (d) order
of birth. These items were followed, beginning on the next page, by
a one-sided, two-sided, or control communication describing a brief
episode in a day in the life of "Jim," a fictitious individual based on the
character described by Luchins (1957). The last page of every booklet
contained four 9-point graphic scales on which the subject was asked
to rate Jim in terms of how (a) friendly or unfriendly, (b) forward
or shy, (c) social or unsocial, (d) aggressive or passive he seemed,
based on the information contained in the communication.

* We thank Robert Holz, Robert Margolis, and Jeffrey Saloway for their help
in recruiting volunteers and George Siniltens for his help in data processing.

Communications. Two one-sided communications were used. One
of the communications was a positive appeal (P), which portrayed Jim
as friendly and outgoing:

    Jim left the house to get some stationery. He walked out into
    the sun-filled street with two of his friends, basking in the sun as
    he walked. Jim entered the stationery store, which was full of
    people. Jim talked with an acquaintance while he waited for the
    clerk to catch his eye. On his way out, he stopped to chat with a
    school friend who was just coming into the store. Leaving the store,
    he walked toward school. On his way out he met the girl to whom
    he had been introduced the night before. They talked for a short
    while, and then Jim left for school.

The other one-sided communication was a negative appeal (N). It
portrayed Jim as shy and unfriendly:

    After school Jim left the classroom alone. Leaving the school,
    he started on his long walk home. The street was brilliantly filled
    with sunshine. Jim walked down the street on the shady side.
    Coming down the street toward him, he saw the pretty girl whom
    he had met on the previous evening. Jim crossed the street and
    entered a candy store. The store was crowded with students, and
    he noticed a few familiar faces. Jim waited quietly until the
    counterman caught his eye and then gave his order. Taking his drink,
    he sat down at a side table. When he had finished his drink he
    went home.

Two other descriptions, or two-sided communications, were constructed
by combining the positive and negative appeals. When the
P description was immediately followed by the N description, without
a paragraph indentation between the two passages, we refer to this
two-sided communication as PN. When N immediately preceded P, we refer
to this two-sided communication as NP. All four communications were
introduced by the following passage:

    In everyday life we sometimes form impressions of people based
    on what we read or hear about them. On a given school day Jim
    walks down the street, sees a girl he knows, buys some stationery,
    stops at the candy store. On the next page you will find a paragraph
    about Jim. Please read the paragraph through only once. On the
    basis of this information alone, answer to the best of your ability
    the questions on the last page of this booklet.

The control subjects received just the introductory passage above,
with the sentences omitted referring to the paragraph on the following
page, succeeded immediately by the four rating scales.

Some of the results of this study have already been given in the
relevant sections of our discussion of volunteer characteristics. Thus,
females volunteered significantly more than males (χ² = 11.78, df = 2,
p < .005); birth order was unrelated to volunteering (χ² = 0.54, df = 2,
p > .75); for the total sample, and for female subjects alone, smoking
and coffee drinking were unrelated to volunteering. Among male subjects,
however, volunteers tended to smoke less (p = .07) and drink less coffee
(p = .06) than nonvolunteers. Even among male subjects, smoking and
drinking accounted for less than 4 per cent of the variance in
volunteering behavior.

Because of the very unequal numbers of subjects within subgroups,
all analyses of each of the four ratings made by subjects were based
on unweighted means. The analyses of variance of the five treatments
by volunteer status by sex of subject showed only significant effects
of treatments. For each of the four ratings analyzed in turn, ps for
treatment were less than .001. In these overall analyses no other ps
were less than .05.

Our greatest interest, however, is in the interaction of volunteer status
with treatments. This interaction was computed for each of the four
dependent variables, and only two of the associated ps were less than
.20. For the variable "friendly," p was .12; for the variable "social,"
p was .17. With so many treatment conditions, however, these Fs for
unordered means are relatively insensitive, and it may be instructive
to examine separately for volunteers and nonvolunteers the specific
experimental effects in which we are most interested. Table VI shows
separately for volunteers and nonvolunteers the difference between the
control group mean and the mean of each of the four experimental
groups in turn. Each of the entries in Table VI is based on the data
from both male and female subjects combined without weighting.
It can be seen from Table VI that the two one-sided communications
were most effective among all subjects. Although these effects were not
significantly greater among volunteers than among nonvolunteers, the
trend was in that direction. Of the eight tests for the significance of
the effectiveness of one-sided communications, all eight reached the
.05 level among volunteers, while only five of the eight tests reached
the .05 level among nonvolunteers. An investigator employing [...]
sample sizes and a strict decision model of inference with an alpha
of either .05 or .01 would arrive at different conclusions [...]
of the time were he to employ volunteer rather than nonvolunteer
samples. Most important, perhaps, is that whenever differences in
significance levels occurred it was the volunteers who favored the more
[...] hypothesis. When the communication was positive, volunteers became
more positive than nonvolunteers; when the communication was negative,
volunteers became more negative than nonvolunteers.


TABLE VI

Effectiveness Among Volunteers and Nonvolunteers of One-Sided and
Two-Sided Communications Compared to Control

Treatment difference          Ratings       Volunteers    Nonvolunteers

Positive (P) minus control    Friendly       +2.18 a        +1.78 b
                              Forward        +1.45 c        +0.42
                              Social         +1.92 c        +1.30 b
                              Aggressive     +1.15 b        +0.20

Negative (N) minus control    Friendly       -1.55 c        -0.70
                              Forward        -2.04 a        -1.68 c
                              Social         -2.22 a        -1.91 c
                              Aggressive     -1.36 a        -1.56 c

PN minus control              Friendly       -0.58          +0.64
                              Forward        -0.34           0.24 d
                              Social         -1.08 b        +0.24
                              Aggressive     -0.34          -0.34

NP minus control              Friendly       +0.30          +0.38
                              Forward        +0.04          -0.52
                              Social         -0.30          +0.06
                              Aggressive     -0.45          -0.52

Note. Ratings could range from +4.00, or favoring strongly the positive
appeal, to -4.00, strongly favoring the negative appeal. Entries in the
table are differences between means of treatment conditions.

a p < .01
b p < .05
c Significant at the .05 level or beyond; the superscript is illegible in
  the source.
d Sign illegible in the source.


When we consider the effects of two-sided communications there
appears to be less volunteer bias. The two-sided communications were
ineffective generally regardless of whether they were compared to the
control group (as shown in Table VI) or to each other (not shown).
Of the 16 mean differences indicating the effectiveness of the two-sided
communications shown in Table VI, only one was significant at the
.05 level, just about what one might expect by chance. Nevertheless,
that one "effect" occurred among volunteers.
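
The remark about chance can be checked with a brief calculation. The sketch
below assumes 16 independent tests with every null hypothesis true. That
independence is an illustrative simplification added here, since the 16
differences share control groups and come from four ratings made by the
same subjects.

```python
n_tests = 16
alpha = 0.05

# Expected number of differences reaching the .05 level by chance alone.
expected_false_positives = n_tests * alpha

# Probability that at least one of the 16 reaches the .05 level by chance.
prob_at_least_one = 1 - (1 - alpha) ** n_tests

print(round(expected_false_positives, 2), round(prob_at_least_one, 3))
```

Under these assumptions a little under one significant difference is
expected, and the chance of seeing at least one is better than even, so a
single significant result among the 16 is unremarkable.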

D. The Marmer Study (1967)

Following a similar recruitment procedure as was used in the preceding
study, Marmer administered to both volunteer and nonvolunteer
subjects treatments adapted from a standard deception introduced by
cognitive dissonance theorists to study the effects of decisional
importance and the relative attractiveness of unchosen alternatives on
postdecisional dissonance reduction. The study was carried out in three
phases. In the first phase, undergraduate women at Boston University
were recruited for a fictitious psychology experiment. The second phase,
which began immediately thereafter, consisted of having the subjects,
volunteer and nonvolunteer alike, complete an opinion survey that was
represented to them as a national opinion poll of college students being
conducted by the University of Wisconsin. The third phase was carried
out one month later. At that time a third experimenter, who was
represented as an employee of the Boston University Communication
Research Center, had the subjects choose between two alternative ideas
whose importance they had evaluated in Phase II. The ideas included,
for example, that there should be more no-grade courses at universities,
that students should unionize in order to gain a more powerful voice
in running the university, that there should be courses in the use and
control of hallucinatory drugs, and that there should be courses in sex
education. For half the subjects a condition of high importance was
created by informing them that their choices would be taken into
consideration by the administration in selecting one idea to be instituted
at Boston University the following year. The remaining subjects,
constituting a low importance condition, were simply instructed to choose
one of the two proffered alternatives, but no information was conveyed
to them which would have implied that their decisions had any practical
importance. Within each of these treatments, attractiveness was
manipulated by having approximately half the subjects choose between
alternatives that had earlier been rated either close together (high relative
attractiveness of the unchosen alternative) or far apart (low
attractiveness).

As predicted by cognitive dissonance theory, there was a greater
spreading apart of the choice alternatives when the subjects re-evaluated
their importance under conditions of high versus low manipulated
importance (p = .11) and high versus low attractiveness (p < .001). This
first dependent variable, however, the spreading apart of the choice
alternatives after the subject had irrevocably decided to choose one
of the two ideas, was not directly influenced by the volunteer variable
nor by any significant interaction of volunteering and either decisional
importance or attractiveness. Clearly, then, volunteering does not
consistently interact with other independent variables to affect experimental
outcomes. It would appear that the deception may have functioned
in a similar manner as theoretically did the two-sided communication
in the preceding experiment, in effect removing or disguising demand
characteristics that might otherwise favor one direction of response over
another.

A second dependent variable, ratings of the survey in Phase III, was
employed by Marmer as a check on the success of the manipulation
of perceived importance. Somewhat surprisingly, volunteers saw the
Boston University survey as less important than did the nonvolunteers. In
addition, volunteer status showed a tendency (p < .08) to interact with
the experimental manipulation of the importance of the subject's
decisions. Nonvolunteers were more affected than volunteers by the
manipulation, a finding that may weaken somewhat our hypothesis that
volunteers are more sensitive and accommodating to the perceived
demand characteristics of the situation.


VII. CONCLUSIONS

We began this chapter with McNemar's lament that ours is a science
of sophomores. We conclude this chapter with the question of whether
McNemar was too generous. Often ours seems to be a science of just
those sophomores who volunteer to participate in our research and who
also keep their appointment with the investigator. Our purpose in this
chapter has been to summarize what has been learned about the act
of volunteering and the more or less stable characteristics of those
who are likely to find their way into the role of data contributor in
behavioral research. Later in the chapter we considered the implications
of volunteer bias for the representativeness of descriptive statistics and
for the nature of the relationships found between two or more variables
in behavioral research.

The act of volunteering was viewed as a nonrandom event determined
in part by more general situational variables and in part by more specific
personal attributes of the person asked to participate as a subject in
behavioral research. More general situational variables postulated as
increasing the likelihood of volunteering included the following:

1. Having only a relatively less attractive alternative to volunteering.

2. Increasing the intensity of the request to volunteer.

3. Increasing the perception that others in a similar situation would
volunteer.

4. Increasing acquaintanceship with, the perceived prestige of, and
liking for the experimenter.

5. Having greater intrinsic interest in the subject matter being
investigated.

6. Increasing the subjective probability of subsequently being
favorably evaluated or not unfavorably evaluated by the experimenter.

On the basis of studies conducted both in the laboratory and in the
field, it seemed reasonable to postulate with some confidence that the
following characteristics would be found more often among people who
volunteer than among those who do not volunteer for behavioral
research:

1. Higher educational level,

2. Higher occupational status,

3. Higher need for approval,

4. Higher intelligence,

5. Lower authoritarianism.

With less confidence we can also postulate that more often than
nonvolunteers, volunteers tend to be:

6. More sociable,

7. More arousal seeking,

8. More unconventional,

9. More often firstborn,

10. Younger.

Two additional and somewhat more complicated relationships may
also be postulated: (a) In survey-type research, volunteers tend to
be better adjusted than nonvolunteers, but in medical research
volunteers tend to be more maladjusted than nonvolunteers. (b) For
standard tasks women tend to volunteer more than men, but for unusual
tasks women tend to volunteer less than men. These more complicated
relationships illustrate the likelihood that there may often be variables
that complicate the nature of the relationship between the act of
volunteering and various personal characteristics. Two such moderating
variables appear to be the sex of the subject and the nature of the task
for which volunteering is requested.

Our survey suggests that those who volunteer for behavioral research
often differ in significant ways from those who do not volunteer. Most
of the research that is summarized here tends to underestimate the effect
of these differences on data obtained from volunteer subjects. In most
of the studies, comparisons were made only between those who indicated
that they would participate as research subjects versus those who
indicated that they would not. However, there is considerable evidence
to suggest that of those who volunteer, a substantial proportion will
never contribute their responses to the data pool. The evidence suggests
that these "no shows" are more like nonvolunteers than they are like
the volunteers who keep their appointments. Therefore, comparing
nonvolunteers with verbal volunteers is really comparing nonvolunteers with
some other nonvolunteers mixed in unknown proportion with true
volunteers. Differences found between nonvolunteers and verbal volunteers
will, therefore, underestimate differences between those who do and those
who do not contribute data to the behavioral researcher.

To the extent that true volunteers differ from nonvolunteers, the
employment of volunteer samples can lead to seriously biased estimates
of various population parameters. In addition, however, there is the
possibility that volunteer status may interact with experimental variables
in such a way as to increase the probability of inferential errors of the
first and second kind. The direct empirical evidence at this time is
scanty and equivocal, but there are indirect, theoretical considerations
that suggest the possibility that volunteers may more often than
nonvolunteers provide data that support the investigator's hypothesis.


REFERENCES 

Abeles, N., Iscoe, I., and Brown, W. F. Some factors influencing the random
    sampling of college students. Public Opinion Quarterly, 1954-1955, 18,
    419-4[...].
Altus, W. D. Birth order and its sequelae. Science, 1966, 151, 44-49.
Aronson, E., Carlsmith, J. M., and Darley, J. M. The effects of expectancy on
    volunteering for an unpleasant experience. Journal of Abnormal and Social
    Psychology, 1963, 66, 220-224.
Beach, F. A. The snark was a boojum. American Psychologist, 1950, 5, 115-124.
Beach, F. A. Experimental investigations of species-specific behavior. American
    Psychologist, 1960, 15, 1-18.
Bean, W. B. The ethics of experimentation on human beings. In S. O. Waife
    and A. P. Shapiro (Eds.), The clinical evaluation of new drugs. New York:
    Hoeber-Harper, 1959. Pp. 76-84.
Bell, C. R. Psychological versus sociological variables in studies of volunteer
    bias in surveys. Journal of Applied Psychology, 1961, 45, 80-85.
Bell, C. R. Personality characteristics of volunteers for psychological studies.
    British Journal of Social and Clinical Psychology, 1962, 1, 81-95.





Belson, W. A. Volunteer bias in test-room groups. Public Opinion Quarterly, 1960,
    24, 115-126.
Bennett, Edith B. Discussion, decision, commitment and consensus in "group
    decision." Human Relations, 1955, 8, 251-273.
Benson, L. E. Mail surveys can be valuable. Public Opinion Quarterly, 1946, 10,
    234-241.
Benson, S., Booman, W. P., and Clark, K. E. A study of interview refusal. Journal
    of Applied Psychology, 1951, 35, 116-119.
Blake, R. R., Berkowitz, H., Bellamy, R. Q., and Mouton, Jane S. Volunteering
    as an avoidance act. Journal of Abnormal and Social Psychology, 1956, 53,
    154-156.
Boucher, R. G., and Hilgard, E. R. Volunteer bias in hypnotic experimentation.
    American Journal of Clinical Hypnosis, 1962, 5, 49-51.
Brady, J. P., Levitt, E. E., and Lubin, B. Expressed fear of hypnosis and volunteering
    behavior. Journal of Nervous and Mental Disease, 1961, 133, 216-217.
Brightbill, R., and Zamansky, H. S. The conceptual space of good and poor hypnotic
    subjects: a preliminary exploration. International Journal of Clinical and
    Experimental Hypnosis, 1963, 11, 112-121.
Brock, T. C., and Becker, G. Birth order and subject recruitment. Journal of Social
    Psychology, 1965, 65, 63-66.
Brower, D. The role of incentive in psychological research. Journal of General
    Psychology, 1948, 39, 145-147.
Brunswik, E. Perception and the representative design of psychological experiments.
    Berkeley: University of California Press, 1956.
Burchinal, L. G. Personality characteristics and sample bias. Journal of Applied
    Psychology, 1960, 44, 172-174.
Capra, P. G., and Dittes, J. E. Birth order as a selective factor among volunteer
    subjects. Journal of Abnormal and Social Psychology, 1962, 64, 302.
Christie, R. Experimental naivete and experiential naivete. Psychological Bulletin,
    1951, 48, 327-339.
Clark, K. E., et al. Privacy and behavioral research. Science, 1967, 155, 535-538.
Clausen, J. A., and Ford, R. N. Controlling bias in mail questionnaires. Journal
    of the American Statistical Association, 1947, 42, 497-511.
Cochran, W. G., Mosteller, F., and Tukey, J. W. Statistical problems of the Kinsey
    report. Journal of the American Statistical Association, 1953, 48, 673-716.
Coffin, T. E. Some conditions of suggestion and suggestibility. Psychological
    Monographs, 1941, 53, No. 4 (Whole No. 241).
Crowne, D. P., and Marlowe, D. The approval motive. New York: Wiley, 1964.
Dittes, J. E. Birth order and vulnerability to differences in acceptance. American
    Psychologist, 1961, 16, 358. (Abstract)
Edgerton, H. A., Britt, S. H., and Norman, R. D. Objective differences among
    various types of respondents to a mailed questionnaire. American Sociological
    Review, 1947, 12, 435-444.
Edwards, C. N. Characteristics of volunteers and nonvolunteers for a sleep and
    hypnotic experiment. American Journal of Clinical Hypnosis, 1968, 11, 26-2[...].
Esecover, H., Malitz, S., and Wilkens, B. Clinical profiles of paid normal subjects
    volunteering for hallucinogenic drug studies. American Journal of Psychiatry,
    1961, 117, 910-915.
Foster, R. J. Acquiescent response set as a measure of acquiescence. Journal of
    Abnormal and Social Psychology, 1961, 63, 155-160.





Franzen, B., and Lazarsfeld, P. F. Mail questionnaire as a research problem. Journal
    of Psychology, 1945, 20, 293-320.
French, J. R. P. Personal communication. August 19, 1963.
Frey, A. H., and Becker, W. C. Some personality correlates of subjects who fail
    to appear for experimental appointments. Journal of Consulting Psychology, [...].
Frye, R. L., and Adams, H. E. Effect of the volunteer variable on leaderless group
    discussion experiments. Psychological Reports, 1959, 5, 184.
Gaudet, H., and Wilson, E. C. Who escapes the personal investigator? Journal
    of Applied Psychology, 1940, 24, 773-777.
Green, D. R. Volunteering and the recall of interrupted tasks. Journal of Abnormal
    and Social Psychology, 1963, 66, 397-401.
Greene, E. B. Abnormal adjustments to experimental situations. Psychological
    Bulletin, 1937, 34, 747-748. (Abstract)
Hayes, D. P., Meltzer, L., and Lundberg, Signe. Information distribution,
    interdependence, and activity levels. Sociometry, 1968, 31, 162-179.
Heilizer, F. An exploration of the relationship between [...] anxiety
    and/or neuroticism. Journal of Consulting Psychology, 1960, 24, [...].
Hilgard, E. R. Personal communication. February 6, 1967.
Hilgard, E. R., Weitzenhoffer, A. M., Landes, J., and Moore, [...]. The
    distribution of susceptibility to hypnosis in a student population: a study
    using the Stanford Hypnotic Susceptibility Scale. Psychological Monographs,
    [...], 75, 8 (Whole No. 512).
Himelstein, P. Taylor scale characteristics of volunteers and nonvolunteers for
    psychological experiments. Journal of Abnormal and Social Psychology, [...],
    138-139.
Hood, T. C. The volunteer subject: patterns of self-presentation and the decision
    to participate in social psychological experiments. Unpublished master's thesis,
    Duke University, 1963.
Hood, T. C., and Back, K. W. Patterns of self-disclosure and [...] the
    decision to participate in small groups experiments. Paper read at Southern
    Sociological Society, Atlanta, April 1967.
Howe, E. S. Quantitative motivational differences between volunteers and
    nonvolunteers for a psychological experiment. Journal of Applied Psychology,
    [...], 115-120.
Hyman, H., and Sheatsley, P. B. The scientific method. In D. P. Geddes (Ed.),
    An analysis of the Kinsey reports. New York: New American Library, 1954.
    Pp. [...].
Jackson, C. W., and Pollard, J. C. Some nondeprivation variables which influence
    the "effects" of experimental sensory deprivation. Journal of Abnormal
    Psychology, 1966, 71, 383-388.
Kavanau, J. L. Behavior: confinement, adaptation, and compulsory regimes in
    laboratory studies. Science, 1964, 143, 490.
Kavanau, J. L. Behavior of captive white-footed mice. Science, [...], 155,
    1623-16[...].
King, A. F. Ordinal position and the Episcopal Clergy. Unpublished [...]
    thesis, Harvard University, 1967.
Kruglov, L. P., and Davidson, H. H. The willingness to be interviewed: a selective
    factor in sampling. Journal of Social Psychology, 1953, 38, [...].
Larson, R. F., and Catton, W. R., Jr. Can the mail-back bias contribute to a
    study's validity? American Sociological Review, 1959, 24, 243-[...].



THE VOLUNTEER SUBJECT 


115 


Lasagna, L , and von Felsinger, J M The volunteer subject in research Science, 
1954, 120, 359-361 

Leipold, W D , and James, R L Characteristics of shows and no shows in a 
psychological experiment Psychological Reports, 1962, 11, 171-174 
Levitt, E E , Lubin, B , and Brady, J P The effect of the pseudovolunteer on 
studies of volunteers for psychology expemnents Journal of Applied Psychology, 
1962, 46, 72-75 

Levitt, E E , Lubin, B , and Zuelcerman, M Note on the atbtude toward hypnosis 
of volunteers and nonvolunteers for an hypnosis experiment Psychological Reports, 
1959, 5, 712 

Levitt, E E , Lubin, B , and Zuckerman M The effect of incenhves on volunteenng 
for an hypnosis experiment International Journal of Cltmcal and Experiment^ 
Hypnosis, 1962, 10, 39-41 

Locke, H J Are volunteer interviewees representative^ Social Problems, 1954, 1, 
143-146 

London, P Subject characteristics in hypnosis research Part I A survey of expen- 
ence, interest, and opimon International Journal of Clinical and Experimental 
Hypnosis. 1961, 9, 151-161 

London, P , Cooper, L M , and Johnson, H J Subject characteristics in hypnosis 
research II Attitudes towards hypnosis, volunteer status, and personality mea« 
sures III Some correlates of hypnohe susceptibility International Journal of 
Clinical and Experimental Hypnosis, 1962, 10, 13-21 
London, P , and Rosenhan, D Personality dynamics Annual Review of Psychology, 
1964, 15, 447-492 

Lubin, B , Brady, J P , and Levitt, E £ A companson of personality charactenshes 
of volunteers and nonvolunteers for hypnosis experiments Journal of Cltmcal Psy- 
chology, 1962, 18, Zil-343 (a) 

Lubin, B , Brady, J P , and Levitt E E Volunteers and nonvolunteers for an 
hypnosis experiment Diseases of the Nervous System, 1962, 23, 642-643 (b) 

Lubin, B , Levitt, E E , and Zuckerman, M Some personality differences between 
responders and nonresponders to a survey quesbonnaire Journal of Consulting 
Psychology, 1962, 26, 192 

Luchins, A S Pnmacy-recency in impression fonnabon 7n C I Hovland et al. 
The order of presentation in persuasion New Haven Yale University Press, 
1957, 33-61 

Manner, Roberta S The effects of volunteer status on dissonance reduebon Unpub- 
hshed master’s thesis, Boston Umversity, 1967 
Marbn, R M , and Marcuse, F L Charactenshes of volunteers and nonvolunteers 
for hypnosis Journal of Chnicol and Expcnmeniol Hypnosis, 1957, 5, 176-180 
Marbn, R M, and Marcuse, F L Characlerisbcs of volunteers and nomolunteers 
in psychological expenmentabon Journal of Consulting Psychology, 1958, 22, 
475-479 

Maslow, A H Self-esteem (dominance feebngs) and sexuality in ^vomcn Journal 
of Social Psychology, 1942, 16, 259-293 

Maslow, A H , and Sakoda, J M Volunteer error in the Kinsey stud) Journal 
of Abnormal and Social Psychology, 1952 47, 239-262 
Matthj'sse, S W Diffcrcnbil effects of religious communfeabons Unpublished doc- 
toral dissertabon. Harvard Umi'crsity, 1966 
McDa\id, J W. Approval seeking motivabon and the Noluntcer subject Journal 
of PcTsonahty and Social Psychology, 1965,2, 115-117 



116 


ROBERT ROSENTHAL AND RALPH L ROSNOW 


McNemar, Q Opinion athtude methodology Psychological Bulletin, 1946, 43, 

289—374 iQAA tw 

Miller, S E Psychology expenments widiout subjects consent Science, laoD, ^ 


Myfrs T I, Murphy, D B, Simth. S. and Goffard, S J Expenmental sluie! 
of sensory deprivation and social isolahon Technical Report 6&-o, on ac 
44-188-ARO-2, HumRRO, Washington, DC George Washington Univer ty. 


Newman, M Personality differences between colunieers and nonvolunteers 
chological investigations (Doctoral dissertation. New York 
of Education) Ann Arbor, Mich Umversity Microfilms, 1956, No , 
Norman, R D A review of some problems related to the mail que^nnaire 
nique Educational and Psychological Measurement, 1948, 8, 235-247 
Ora J P , Jr Characteristics of the volunteer for psychological invesbgafaow 
cal Report, No 27, November, 1965, Vanderbilt Umversity, Contract Non 

Ora, J P , Jr Personality characteristics of college freshman volunteers for ps^ycbo 
logical experiments Unpubhshed master’s thesis, Vanderbilt 
Orlans, H Developments m federal policy toward university research etc 

155, 665-668 ^ students 

Pace, C R Factors influencing questionnaire returns from former university 

Journal of Applied Psychology, 1939, 23, 388-397 nuesbon 

Pan, Ju Shu Social characteristics of respondents and 

naire study of later matunty Journal of Applied Psychology, 1951, > „evchiatnc 

Perhn S, PoUin W, and Butler, R N The experimental subject 1 f 
evaluation and selechon of a volunteer population American Me tea 
Archives of Neurology and Psychiatry, 1958, 80, 65-70 volunteers 

Polhn, W , and Perhn, S Psychiatric evaluation of ‘ normal con 

American Journal of Psychiatry, 1958, 115, 129-133. >,-lor’s thesis 

Poor, D The social psychology of questionnaires Unpubhshed bac 

Harvard University, 1967 nding * 

Reuss, C F Differences between jpersons responding and 

mailed questionnaire Amencan Sociological Review, 1943, 8, 43 ^ on a dnig 

Richards, T W Personality of subjects who volunteer for researc 

(mescaline) Journal of Projectwe Techniques, 1960,24, ^2,4-42,8 , 1959> 

flichter, C P Rats, man, and the welfare slate American Psyc o 

1 vchology 

Riecken, H W A program for research on expenments m ^ York 

N F Washbume (Ed ), Decisions, values and groups o 


Pergamon, 1962, 25--11 volunteers 

Riggs, Margaret M , and Kaess, W Personahty differences e 

and nonvolunteers Journal of Psychology, 1955, 40, 229-245 2", 

Robins, Lee N The reluctant respondent Public Opinion Qva 

276-286 1966 

Rokeach, M Psychology expenments without subjects consent cien . 


for psychoJog*'--' 

Rosen, E Differences between volunteers and non volunteers 

studies Journal of Applied Psychology, 1951, 35, 185-193 volun*^'^"^ 

Rosenbaum, M E The effect of stimulus and background 118-121 

response Journal of Abnormal and Social Psychology, 1 • ’ 



THE VOLUNTEER SUBJECT 


117 


Rosenbaum, M E , and Blake, R R Volunteering as a function of field structure 
Journal of Abnormal and Social Psychology, 1955, 50, 193-196 
Rosenhan, D On the social psychology of hypnosis research 7n J E Gordon 
(Ed ), Handbook of chntcal and expenmental hypnosis New York Macmillan, 
1967, 481-510 

Rosenthal, R The volunteer subject Human Relations, 1965, 18, 389-406 
Rosenthal, R Experimenter effects m behavioral research New York Appleton 
Century Crofts, 1966 

Rosnow, R L , and Rosenthal, R Volunteer subjects and the results of opinion 
change studies Psychological Reports, 1966 19, 1183-1187 
Rosnow, R L, and Rosenthal, R Unpublished data (described m this chapter), 
1967 

Ruebhausen, O M , and Bnm, O G Pnvacy and behavioral research American 
Psychologist, 1966, 21, 423-437 

Schachter, S The psychology of affiliatton Stanford, Calif Stanford Umversity 
Press. 1959 

Schachter, S , and Hall, R Group denved restraints and audience persuasion Human 
Relations, 1952, 5, 397-406 

Scheier, I H To be or not to be a guinea pig preliminary data on anxiety and 
the volunteer for experiment Psychological Reports, 1959, 5, 239-240 
Schubert, DSP Arousal seeking as a motivahon for volunteenng M\1PI scores 
and central nervous system stimulant use as suggesti\ e of a trait Journal of Projec- 
tive Techniques and Personality Assessment, 1964, 28, 337-340 
Schultz, D P Birth order of volunteers for sensory restriction research Journal 
of Social Psychology, 1967, 73, 71-73 (a) 

Schultz, D P Sensation seeking and volunteering for sensory deprivation Paper 
read at Eastern Psychological Association, Boston April, 1967 (b) 

Schultz, D P The volunteer subject m sensory restncUon research Journal of 
Social Psychology. 1967, 72, 123-124 (c) 

Shuttleworth, F K Sampling errors involved m incomplete returns to mail question- 
naires Psychological Bulletin, 1910, 37, 437 (Abstract) 
fregman, A Responses to a personafit) quesOonmirc 6y voftintcers and nonvoAinfeers 
to a Kinsey interview Journal of Abnormal and Social Psychology, 1930, 52, 
280-281 

Silverman, I Note on the relationship of self-esteem to subject self selection Percep- 
tual and Motor Skills, 1964, 19, 769-770 

Smart, R C Subject selection bias in psychological research Canadian Psychologist, 
1060. 7n, 115-121 

Stnnton, V Notes on the validity of mail questionnaire returns Journal of Applied 
Psychology, 1939, 23, 05-191 

Staples, F R , and Walters, R H Anxiety, birth order, and susceptibility to social 
infliicnce Jouniaf of Abnormal and Social Psychology, IDGl, 62, 710-719 
Suchman, E, and McCandlcss, B WTio answers questionnaires’ Journal of Applied 
Psychology, 1940. 24, 75S-7G9 

Suetlfcld, P Birth ordir of volunteers for sensory deprivation Journal of Abnormal 
end Social Psychology, lOG-l, 68, 105-190 

Varela. J A A cross-cultural replication of on experiment Involving birth order 
Journal of Abnormal end Social Psychology, 1001 69, 450—157 
Wallin, P. Volunteer subjects as a source of sampling bias American Journal 
of Sociology, 1919, 54. 539-51 1 



118 


ROBERT ROSENTHAL AND RALPH L BOSNOW 


Ward, C D A further examinafaon of birth order as a selective factor among 
volunteer subjects Journal of Abnormal and Social Psychology, 1964, 69, 311-313 
Warren J R Birth order and social behavior Psychological Bulletin, 1966 65, 
38-49 

Weiss, J M, Wolf, A, and Wiltsey, R G Birth order, recmtment conditions 
and preferences for partiapahon jn group versus non-group experiments American 
Psychologist, 1963, 18, 356 (Abstract) 

Wicker, A W Requirements for protecting privacy of human subjects some imp ca 
tions for generahzabon of researdi findings American Psychologist, 1968 > 

70-72 

Wilson, P R , and Patterson, J Sex differences in volunteering behavior Psyc o g» 
cal Reports 1965, 16, 976 

Wolf, A , and Weiss, J H Birth order, recruitment condibons, and volunteenng 
preference Journal of Personalitij and Social Psychology, 1965, 2, 269- 
Wolfensberger, W Ethical issues in research with human subjects Science, > 
155, 47-51 

Wolfgang, A Sex differences in abstract abihty of volunteers and nonvo nn ee 
for concept learning experiments Psychological Reports, 1967, 21, 509-512 
Wolfle, D Research with human subjects Science, 1960, 132, 989 
Wnghtsman L S Predicting college students* participation in required psycho gj 
experiments American Psychologist, 1966, 21, 812-813 , 

Zamansky, H S , and Bnghtbill, R F Attitude ifferences of volunteers and nonv ^^ 

teers and of susceptible and nonsusceptible hypnobc subjects Internationa o 
of Clinical and Experimental Hypnosis, 1965, 13, 279-290 
Zimmer, H Validity of extrapolabng nonresponse bias from mail quo® ® 
follow-ups /ournaf 0 / Applied Psychology, 1956, 40, 117-121 . 

Zuckerman, M , Schultz, D P . and Hopkins T R Sensabon seeking and 
mg for sensory deprivation and hypnosis experiments Journal of Consu ing 
chology, 1967, 31, 358-363 



Chapter 4 


PRETEST SENSITIZATION* 


Robert E. Lana 
Temple University 


When Max Planck sought an explanation for heat radiating from a 
black body at high temperatures, he focused not upon radiation per 
se but upon the radiating atom and thus became one of the men most 
responsible for beginning a line of thought and research which ended 
in the formulation of quantum theory. The development of quantum 
theory eventually led to major reconceptions in the field of physics, 
which challenged the Newtonian models that were then predominant. 
The wave-particle controversy was recognized as a result of the work
of Schrödinger and others, and this provided a context for an
interpretation of quantum theory and, within that context, for the
recognition of what was called the principle of indeterminacy.

It is possible to speak of the position and the velocity of an electron
as one would in Newtonian mechanics, and one can observe and measure
both of these quantities. However, one cannot determine both quantities
simultaneously with a limitless degree of accuracy. Relations between
quantities such as these are called relations of uncertainty, or
indeterminacy. Similar relations can be formulated for other experimental
situations. The wave and particle theories of radiation, two complementary
explanations of the same phenomenon, were interpreted in such
a manner. There were limitations to the use of both the wave and the
particle concept. These limitations are expressed by the uncertainty
relations, and hence any apparent contradiction between the two
interpretations disappears.

* Many of the studies done by the author and reported in this chapter were
supported by the National Institute of Mental Health, United States Public
Health Service.

The idea of uncertainty in physics can best be illustrated by a
Gedanken (theoretical) experiment given by Heisenberg (1958, 47):
"One could argue that it should at least be possible to observe the
electron in its orbit. One should simply look at the atom through a
microscope of a very high resolving power, then one would see the
electron moving in its orbit. Such a high resolving power could to be
sure not be obtained by a microscope using ordinary light, since the
inaccuracy of the measurement of the position can never be smaller
than the wave length of the light. But a microscope using gamma rays
with a wave length smaller than the size of the atom would do. . . . The
position of the electron will be known with an accuracy given by the
wave length of the gamma ray. The electron may have been practically
at rest before the observation. But in the act of observation [italics
mine] at least one light quantum of the gamma ray must have passed
the microscope and must first have been deflected by the electron.
Therefore, the electron has been pushed by the light quantum, it has
changed its momentum and velocity, and one can show that the uncertainty
of this change is just big enough to guarantee the validity of the
uncertainty relations." It is evident from this Gedanken experiment that
the very act of measurement negated the possibility of observing the
phenomenon as it would have occurred had it not been observed. It is
important to note, however, that we are dealing with phenomena at
the limits of physical existence, namely those of sub-atomic particles.
Measurement of physical activity farther from this limit (or away from
the limit of infinite space and time at the other end)
is not so sensitive to the influence of the measuring instrument or
technique (as, for example, when one measures the speed and position of
a freely falling object at sea level). Heisenberg states, "The measuring
device deserves this name only if it is in close contact with the rest
of the world, if there is an interaction between the device and the
observer. If the measuring device would be isolated from the rest
of the world, it would be neither a measuring device nor could it be
described in the terms of classical physics at all."

One of the implications for life sciences of the interpretation of
quantum theory through appeal to uncertainty relations has been pointed
out by Niels Bohr (discussed by Heisenberg, 1958, 104-105), namely
that our knowledge of a cell's being alive may be dependent upon
complete knowledge of its molecular structure. Such complete
knowledge may be achievable only by operations which would destroy the
life of the cell. It is, therefore, logically possible that life precludes
the complete determination of its underlying physiochemical nature.


I. THE HAWTHORNE STUDIES 

In 1927, Mayo, Roethlisberger, Whitehead, and Dickson
(Roethlisberger and Dickson, 1939) began a series of studies in the
Hawthorne plant of the Western Electric Company. That series not
only launched modern industrial psychology on its current path, but
also introduced the idea that the process of measurement in social
psychological situations can influence what is being measured and change
its characteristics. For our purposes the most pertinent results of these
studies are those which are perhaps most general. The original aim
of the studies was to examine the effects on production of such work
conditions as illumination, temperature, hours of work, rest periods, wage
rate, etc. Six female workers were observed. The interesting result was
that their production increased no matter what the manipulation.
Whether hours of work or rest periods were increased or decreased,
production always increased. The reason given by the authors for this
effect was that the women felt honored at being chosen for the
experiment. They felt that they were a team and worked together for the
benefit of the group as a whole. What I wish to emphasize is that,
from the point of view of the experimenter, the fact of measurement
changed not only the magnitude of the dependent variable (rate of
production), but the very nature of the social situation as well.

The principle of indeterminacy found in the physical situation is, at
least analogously, operating in this social psychological situation. One
finds a definite relationship between the observational process of the
experimenter and the natural process of the subject. If we narrow the
context of our inquiry to the effect of any specific device designed to
measure some relevant characteristics of the organism, we will have
arrived at the principal point of departure of this paper.


II. CURRENT METHODOLOGY 

The relevant state of the organism must be determined before an
experimental treatment is applied in much psychological research. This
is necessary since all psychological experiments are designed to test
an hypothesis of change from an initial state of the organism to some
other state as a result of an experimental treatment. Therefore, some
assessment of the magnitude of a given variable is necessary prior to
the administration of the experimental treatment.

One may legitimately raise the question of why an experimental
hypothesis of change needs to be examined by assessing the value of the
dependent variable prior to treatment through the use of a pretest. It
is certainly possible to substitute a randomization design for a pretest
design. By randomly selecting subjects from a defined population and
by randomly assigning them to the various experimental treatments in
a given study, one may assume the comparability of these subjects.
Any differences among the scores of the various groups are directly
comparable to one another and hence a pretest is unnecessary. However,
there are some reasons why the use of a pretest is preferable to a
randomization design. Given a constant N, the use of a pretest will often
increase the precision of measurement by controlling for individual
differences within subgroups. In addition, should there be a failure
of randomization, comparison of the subgroups' pretest means will
reveal it.


Of course, it is also possible that a pretest might be used
to detect differences in initial performance, so that the effects of the
experimental manipulation taking into account these differences can be
examined. However, this has not been of typical interest to researchers
utilizing pretest designs in attitudinal studies.

The principal point is that we are interested in demonstrating
empirically that a given treatment either succeeds or does not succeed in
changing some existing variable in the organism (such as an opinion),
and the most direct way of establishing such a fact is to assess the
variable before and after the application of the treatment.

The ideal experiment is one in which the relevant preexperimental
state of the organism is determined without affecting that state by the
very measuring process itself. Unfortunately, it is rarely possible to
achieve this aim, since it is almost always necessary to change the
environment of the subject in some way in order to obtain a measurement.
However, it is not impossible to do so, as, for example, in a
situation where the subject is not aware that he is being observed and
his response recorded (cf. Campbell, 1967). Several control groups and
thus various types of pretreatment manipulation of the subject are
usually necessary in most studies. The intent of this paper is to examine
the nature of these pretreatment measures for sources of variation that
may disrupt the legitimacy of the conclusions it is possible to draw
within the context of social psychology.

Since several experimental situations found in the research
require some preliminary manipulation involving the subject before the
treatment can be applied, appropriate controls are necessary to isolate
all possible sources of variation. Any manipulation of the subject or
of his environment by the experimenter prior to the advent of the
experimental treatment, which is to be followed by some measure of
performance, allows for the possibility that the result is due either to the
effect of the treatment or to the interaction of the treatment with the
prior manipulation. A control group is needed to which is presented
the prior manipulation followed by the measure of performance, without
the treatment intervening. This control group can then be compared
with the experimental group receiving prior manipulation, treatment,
and the measure of performance. Should there be a significant difference
between the two groups, one may reach a conclusion as to the relative
effectiveness of the two methods for increasing or decreasing
performance. This control is diagrammed in Table I.

TABLE I

Control for Effect of Prior Manipulation

          I                          II

Prior Manipulation         Prior Manipulation
Treatment
Measure of Performance     Measure of Performance


TABLE II

Controls for Effect of Prior Manipulation and its
Interaction With the Treatment

          I                          II                         III

Prior Manipulation         Prior Manipulation
Treatment                                             Treatment
Measure of Performance     Measure of Performance     Measure of Performance


Even though the application of this design allows one to make a
direct comparison between groups and yields an evaluation of the effect
on performance of the prior manipulation and the treatment, it is to
be noted that there is no logical possibility of evaluating the effect
of the treatment alone on the measure of performance. In order to do
this, a second control group must be added. The revised design is shown
in Table II.

The second control group (Group III), which presents the subject
with the treatment and follows this with the measure of performance,
now permits us to examine not only the effects of prior manipulation
on the measure of performance, and the combined effects of prior
manipulation and treatment, but also the effect of the treatment alone.
Thus, three possible comparisons may now be made: Group I with Group
II, Group I with Group III, and Group II with Group III. However,
there is still one source of variation which remains unaccounted for
in this design. Conceivably the performance of the subject alone might
be quite similar in magnitude to his performance under any or all of
the conditions contained in Groups I, II, and III. In order to examine
this hypothesis, as indicated in Table III, a third and final control group
must be added to the design.*

TABLE III

Controls for Effects of Prior Manipulation, and its Interaction
With the Treatment and Existing Magnitude of Performance
in the Subject

        I                     II                    III                   IV

Prior Manipulation    Prior Manipulation
Treatment                                   Treatment
Measure of            Measure of            Measure of            Measure of
Performance           Performance           Performance           Performance

The design is now complete and all possible effects on the dependent
variable of prior manipulation and treatment have been accounted for.
Obviously the design can be further complicated if the time intervals
among the prior manipulation, treatment, and measure of performance
are varied. However, the principle of control remains the same.
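
The logic of the four groups in Table III can be made concrete with a small sketch. It is only an illustration: the dictionary layout, the function name `contrasts`, and the particular differences computed are assumptions introduced here, not part of the original design description, and real data would call for significance tests rather than raw differences between means.

```python
# Four-group layout of Table III. The keys and field names are
# illustrative labels, not terminology from the chapter.
DESIGN = {
    "I":   {"prior": True,  "treatment": True},
    "II":  {"prior": True,  "treatment": False},
    "III": {"prior": False, "treatment": True},
    "IV":  {"prior": False, "treatment": False},
}


def contrasts(means):
    """Differences among posttest group means, keyed by what each isolates.

    `means` maps group label ("I".."IV") to that group's mean
    measure of performance.
    """
    return {
        # Group IV is the baseline: performance with nothing applied.
        "treatment_alone": means["III"] - means["IV"],
        "prior_manipulation_alone": means["II"] - means["IV"],
        "combined": means["I"] - means["IV"],
        # Does the treatment effect differ with and without the
        # prior manipulation?
        "interaction": (means["I"] - means["II"])
                       - (means["III"] - means["IV"]),
    }


if __name__ == "__main__":
    example = {"I": 10.0, "II": 6.0, "III": 7.0, "IV": 5.0}
    for name, value in contrasts(example).items():
        print(name, value)
```

Without Group IV none of these differences has a baseline, which is why the fourth group is needed to complete the design.
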

In order to apply the final design presented in Table III, certain
assumptions must be made concerning the distribution of subjects among
the four groups involved. Subjects must be chosen at random from a
population, and randomly assigned to one of the four groups. The
assumption is that subjects assigned to any one group will be similar,
in all relevant characteristics, to subjects assigned to any of the other
groups. A special problem arises if the prior manipulation happens to
be a test tapping some already existing quality in the subject, such as
an opinion questionnaire regarding racial prejudice or a test examining
achievement in knowledge of American History, instead of being a
manipulation such as injecting a drug where no measurement is involved.
Since Groups III and IV of Table III are not exposed to prior
manipulation (e.g., an opinion questionnaire) there is no immediate assurance
that the groups are initially homogeneous with respect to the opinion
being tapped. Yet, it is necessary to arrange the groups as in Table
III if one wishes to control for the effects of the three elements of
the design.

* Solomon (1949), in a now classic paper, was the first to discuss this
in systematic detail.

There are at least two possible solutions to this dilemma. One has
been suggested by Solomon (1949). Since Groups I and II have pretest
measures taken on them, it is possible to calculate the mean value and
the standard deviation for both groups on their questionnaire scores.
The mean of these means and the combined standard deviation of the
two groups can then be assigned to Groups III and IV as the best
estimates of the pretest scores of these groups in lieu of administering
a questionnaire to them. It is then possible to examine the change from
pretest mean scores to posttest (measurement of performance) mean
scores for all groups, without actually having applied the pretests to
Groups III and IV. The original means of Groups I and II are used
in the analysis. Degrees of freedom utilized in any tests of significance
should be those appropriate for each of the actual four groups. However,
this method is tenuous if one has little information as to the
comparability of the various groups of subjects.
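
A minimal sketch of this estimation step follows. The function name `estimate_unpretested` and the sample scores are hypothetical, and the "combined standard deviation" is interpreted here, as an assumption, as the sample standard deviation of the two groups' scores taken together; other pooling formulas are possible and the text does not say which was intended.

```python
from statistics import mean, stdev


def estimate_unpretested(group_1_scores, group_2_scores):
    """Estimate a pretest mean and SD for the unpretested groups.

    Returns (mean_of_means, combined_sd). The mean of the two group
    means is used as the estimate of the pretest mean for Groups III
    and IV. The combined SD is computed over all pretest scores from
    Groups I and II together (an assumed reading of "combined").
    """
    mean_of_means = (mean(group_1_scores) + mean(group_2_scores)) / 2
    combined_sd = stdev(list(group_1_scores) + list(group_2_scores))
    return mean_of_means, combined_sd


if __name__ == "__main__":
    # Hypothetical pretest questionnaire scores for Groups I and II.
    est_mean, est_sd = estimate_unpretested([1, 2, 3], [3, 4, 5])
    print(est_mean, est_sd)
```

With equal group sizes the mean of means equals the grand mean of the pretested subjects; with unequal sizes the two differ, and the choice between them would need to be stated.
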

An alternative or complementary solution is to examine for
comparability a large number of subjects from the pool from which the final
selection of experimental subjects will be chosen. Thus, should there be
available 500 comparable subjects from which we wish to choose 100 for
our experiment, then the following procedure would be useful. Randomly
choose the 100 subjects to be used in the four groups of the
experiment. Randomly assign 25 subjects to each of the four treatment
groups. Randomly assign two of these four groups to the two pretest
conditions, and administer the pretest. Pretest the remaining 400 people
in the original pool. Finally, assign the grand mean and grand standard
deviation of all 450 pretested subjects to Groups III and IV, the
unpretested groups. (Lana, 1959, discusses this problem and provides a related
example.)
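
The procedure just described can be sketched as follows. The function names, the use of integer subject identifiers, the fixed seed, and the `pretest_scores` mapping are all assumptions made for illustration; only the counts (500 in the pool, 100 chosen, 25 per group, 450 pretested) come from the text.

```python
import random
from statistics import mean, stdev


def allocate(pool_ids, n_experiment=100, seed=0):
    """Randomly choose subjects and split them into Groups I to IV.

    Returns (groups, pretested_ids). Groups I and II are the pretested
    conditions; pretested_ids also includes everyone left in the pool.
    """
    rng = random.Random(seed)
    chosen = rng.sample(pool_ids, n_experiment)  # random order already
    size = n_experiment // 4
    blocks = [chosen[i * size:(i + 1) * size] for i in range(4)]
    rng.shuffle(blocks)  # random assignment of blocks to conditions
    groups = dict(zip(["I", "II", "III", "IV"], blocks))
    chosen_set = set(chosen)
    remainder = [s for s in pool_ids if s not in chosen_set]
    pretested_ids = groups["I"] + groups["II"] + remainder
    return groups, pretested_ids


def grand_estimates(pretest_scores, pretested_ids):
    """Grand mean and SD of all pretested subjects.

    `pretest_scores` is a hypothetical mapping from subject id to
    pretest score; the result is assigned to Groups III and IV.
    """
    scores = [pretest_scores[s] for s in pretested_ids]
    return mean(scores), stdev(scores)


if __name__ == "__main__":
    pool = list(range(500))
    groups, pretested = allocate(pool)
    fake_scores = {s: float(s % 10) for s in pool}  # placeholder data
    print(len(pretested), grand_estimates(fake_scores, pretested))
```
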

There are essentially two types of prior manipulations that are utilized
in our basic design. One type (Case I) we may designate as an
experimental condition of some sort, such as receiving a jolt of electricity,
swallowing a pill, or pressing a button. In this type of manipulation
the measure of performance (posttest) is always different from the prior
manipulation; the two never require the same task from the subject.
Also the prior manipulation is not a measure of performance or of
previous condition of the organism as it is, e.g., when opinion questionnaires,
spelling tests, etc., are used. Essentially, Case I represents the situation
where some pre-treatment applied to the subject is necessary in order
to examine the effects of the principal treatment. Strictly speaking, no
pretest is involved, but rather a part of the experimental treatment
conceptualized as a pre-condition necessary for examination of the
dependent variable. In Case II the prior manipulation requires the same kind
of a performance from the subjects as does the posttest and is actually
a part of the dependent variable (pretest-posttest change). With the
Case I type of pretreatment condition only random assignment of
subjects to the initial groups can be used to assure homogeneity of subjects.
When the prior manipulation is exactly the same task as the measure
of performance (Case II), as, for example, when an opinion
questionnaire is used for both, it is also possible to estimate pretest measures
for groups which cannot be pretested. These procedures, discussed
above, become extremely important for this type of situation although
irrelevant when the prior manipulation is non-mensurative. In
most of the situations with which we shall be concerned, Case II
predominates.

III. ANALYSIS OF CASE I DESIGNS

Following Solomon's article (Solomon, 1949) a good deal of work
has been done in examining the effects on performance of prior
manipulation in interaction with a succeeding treatment. A recent paper by
Ross, Krugman, Lyerly, and Clyde (1962) develops the four group design
for use in certain types of psychopharmacological studies and provides
an example of our Case I. Although there may be instances where a
psychopharmacological study might include a prior manipulation which
taps an existing attribute of the subject, most pretest-treatment
interaction designs in this area are of the type where the prior manipulation
is some experimental condition and is therefore not a pretest and is
not repeated in the posttest. The experiment by Ross et al. is of the
latter type and illustrates the proper statistical analysis to be used with
this kind of data. The design is contained in Table IV.

TABLE IV

Design of Experiment by Ross, Krugman, Lyerly, and Clyde (1962)

        I                  II                   III                  IV

Pill               Pill
Drug               No drug (Placebo)    Drug (disguised)     No drug
Task (tapping)     Task (tapping)       Task (tapping)       Task (tapping)
It is to be noted that this design fulfills exactly the conditions
summarized in Table III for permitting the assessment of prior
manipulation-treatment interactions, the effects of prior manipulation alone, and
the effects of task alone. It is also to be noted that the measure of
performance (a tapping task) is different from the prior manipulation,
which is the swallowing of a pill. Since the prior manipulation in this
case is not designed to tap any existing attribute of the individual, the
usual random assignment of subjects to the various groups should be
utilized.

A double classification analysis of variance is the proper method of
analysis in examining the various main and interaction effects. The
main effects for drug and pill are examined against the mean square for
error, as is the interaction mean square. Degrees of freedom and
appropriateness of error term are determined as in an ordinary double
classification analysis of variance. A significant F ratio for drug
would suggest that the drug treatment significantly affected the
performance of the task. A significant main effect for pill would
suggest that the prior manipulation (actually giving a placebo in the
form of a pill) significantly affected the performance of the task. A
significant interaction effect would indicate that the effect of the
prior manipulation and the treatment taken together affected the task
and, therefore, main effects, if significant, would become more complex
to interpret. Actually, if the main effect for drug were significant in
our example, and the interaction between drug and pill were also
significant, but not the main effect for pill, then depending upon the
shape and magnitude of the interaction the following interpretation
might be made: The drug, in itself, is powerful enough to have an
effect on task performance regardless of other factors in the
experiment. However, if the drug were taken in pill form this factor
would also have a significant effect on the performance of the task.
Obviously, any theoretical interpretation of these results would have
to await further research. Should the interaction alone be significant,
the interpretation would be more difficult. This situation is discussed
below.
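
The double classification analysis described above can be sketched in a
few lines of Python. This is only an illustration under stated
assumptions, not the analysis reported by Ross et al.: the tapping
scores are invented, the function name two_way_anova is hypothetical,
and equal group sizes are assumed. Only F ratios are computed, each
against the within-cell error mean square; significance would still be
judged against tabled F values for the degrees of freedom returned.

```python
from statistics import mean


def two_way_anova(cells):
    """Balanced two-factor analysis of variance.

    cells[i][j] holds the scores for level i of the first factor (drug)
    and level j of the second factor (pill). Every cell must contain the
    same number of scores. Each effect is tested against the within-cell
    (error) mean square.
    """
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    if any(len(cell) != n for row in cells for cell in row):
        raise ValueError("design must be balanced")

    grand = mean(x for row in cells for cell in row for x in cell)
    cell_means = [[mean(cell) for cell in row] for row in cells]
    a_means = [mean(cell_means[i]) for i in range(a)]
    b_means = [mean(cell_means[i][j] for i in range(a)) for j in range(b)]

    ss_a = n * b * sum((m - grand) ** 2 for m in a_means)
    ss_b = n * a * sum((m - grand) ** 2 for m in b_means)
    ss_ab = n * sum(
        (cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
        for i in range(a) for j in range(b)
    )
    ss_error = sum(
        (x - cell_means[i][j]) ** 2
        for i in range(a) for j in range(b) for x in cells[i][j]
    )

    df_a, df_b = a - 1, b - 1
    df_ab, df_error = df_a * df_b, a * b * (n - 1)
    ms_error = ss_error / df_error
    return {
        "F_drug": (ss_a / df_a) / ms_error,
        "F_pill": (ss_b / df_b) / ms_error,
        "F_interaction": (ss_ab / df_ab) / ms_error,
        "df_effect": (df_a, df_b, df_ab),
        "df_error": df_error,
    }


# Invented tapping scores (assumption), four subjects per group.
# Rows: no drug, drug. Columns: no pill, pill.
tapping = [
    [[10, 12, 11, 13], [11, 12, 13, 12]],  # no drug; placebo
    [[15, 16, 14, 17], [16, 15, 17, 16]],  # disguised drug; drug in pill
]
result = two_way_anova(tapping)
print(result)
```

With these invented scores the drug effect dominates, the pill effect is
small, and the interaction is zero, which is the simplest of the
patterns of results discussed above.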


IV. ANALYSIS OF CASE II DESIGNS

Within the framework of the pretest-treatment-posttest design (i.e.,
prior manipulation-treatment-measure of performance) a subject's
initial response to a questionnaire may provide a basis of comparison
with later questionnaire responses, and thus a positive or negative
correlation of some magnitude might be expected between the
individual's first and second scores to the same questionnaire. This
expectation of correlation between successive, similar tasks is the
basis for methodological concern with repeated measurements designs.

When two treatments are performed in succession, a treatment carryover
may occur. The rotation or counterbalanced design (Cochran and Cox,
1957) is specifically intended to give information on such a treatment
carryover. In the experimental situations described in this chapter,
where a single pretest precedes a single treatment, the possible
effects of carryover from pretest to treatment are the same as the
treatment-to-treatment carryover. A rotation design is not possible
when the first variable is a pretest, since by definition a pretest
must precede the treatment. Consequently, any examination of the
confounding effects of the pretest with the treatment must be made by
experimentally manipulating the nature and application of the pretest.
The major question that remains is directed at the nature of the
relationship existing between these two variables. Thus the
pretest-treatment-posttest research design is a special case of the
general repeated measures design where there are multiple treatments or
tests of the same organism over time.
Ordinarily in a pretest-posttest design, since the initial score on the
pretest has a tendency to be variable over subjects, a covariance
design with pretest score as the covariable is appropriate and very
useful. However, as we have seen, the very nature of our interest in
pretest sensitization is disastrous for the use of covariance, since,
in order to fulfill the requirement of the four-group design, some
groups will not have been pretested. Consequently, the principal design
needs to be of [...]

If one can assume that there is a high probability that the unpretested
groups in the four-group control design would not be significantly
different on pretest scores from the groups who were pretested, then a
two by two factorial analysis of variance can be computed on the
posttest scores of the four groups. This analysis will yield main
effects for treatment and for pretesting, and a first order interaction
term between the two. Should the interaction effect not be significant,
but both of the main effects be significant, the interpretation is
straightforward. A significant main effect for treatment indicates that
the treatment affected the posttest score in either a facilitative or
depressive manner, depending on the direction of the mean change
scores. A similar interpretation may be made for a significant
pretesting main effect. Should the interactive effect of pretesting and
treatment be significant, regardless of whether or not the main effects
are significant, interpretation would become more complicated (see Lana
and Lubin).
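
The three quantities that the two by two analysis separates can be made
concrete with cell means. The following Python sketch is illustrative
only: the posttest scores are invented, and the function name
factorial_contrasts and the (pretested, treated) keys are assumptions
introduced here. It computes the treatment and pretesting main-effect
contrasts and the interaction contrast (a difference of differences)
for the four groups.

```python
from statistics import mean


def factorial_contrasts(scores):
    """Contrasts for a 2 x 2 posttest-only analysis.

    scores maps (pretested, treated) -> list of posttest scores, where
    both key elements are booleans.
    """
    m = {cell: mean(values) for cell, values in scores.items()}
    treatment = (m[(True, True)] + m[(False, True)]) / 2 - (
        m[(True, False)] + m[(False, False)]) / 2
    pretesting = (m[(True, True)] + m[(True, False)]) / 2 - (
        m[(False, True)] + m[(False, False)]) / 2
    # Treatment effect among pretested groups minus the treatment
    # effect among unpretested groups.
    interaction = (m[(True, True)] - m[(True, False)]) - (
        m[(False, True)] - m[(False, False)])
    return {
        "treatment": treatment,
        "pretesting": pretesting,
        "interaction": interaction,
    }


# Invented posttest scores (assumption), four subjects per group.
posttest_scores = {
    (True, True): [14, 16, 15, 15],    # pretested, treated
    (True, False): [11, 12, 13, 12],   # pretested, untreated
    (False, True): [13, 14, 12, 13],   # unpretested, treated
    (False, False): [10, 11, 9, 10],   # unpretested, untreated
}
contrasts = factorial_contrasts(posttest_scores)
print(contrasts)
```

In this invented case both main-effect contrasts are nonzero while the
interaction contrast is zero, the situation in which the text above
calls the interpretation straightforward.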

When an interaction term is significant it can be concluded that the
usual interpretations regarding the data which would ordinarily follow
from the hypothesis testing model cannot be made. That is, with a
significant interaction effect, the experimenter needs to exercise
extreme care in the interpretation of his data and, in many instances,
he may have to re-examine the manner in which he has constructed the
empirical aspects of his problem. The significant interaction should
lead him to reconstruct his hypotheses along somewhat different lines.
Scheffé concludes, “In order to get exact tests and confidence
intervals concerning the main effects it is generally necessary with
the fixed effects model (but not the random effects model or mixed
model) to assume that there are no interactions.” All of the studies
discussed below use the fixed effects model. “It happens occasionally
that the hypothesis of no interactions will be rejected by a
statistical test, but the hypothesis of zero main effects for both
factors will be accepted. The correct conclusion is then not that no
differences have been demonstrated. If there are (any nonzero)
interactions there must be (nonzero) differences among the cell means.
The conclusion should be that there are differences, but that when the
effects of the levels of one factor are averaged over the levels of the
other, no difference of these averaged effects has been demonstrated”
(Scheffé, 1959, 94). The point of this discussion is that since we are
specifically looking for significant interaction terms, if we find one
it should act as a warning device. The experimenter should then
reformulate the problem. In the pretest-treatment-posttest design case,
the use of a pretest becomes suspect for the given data of the study.
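
Scheffé's point can be illustrated with a small numerical case. The
cell means below are invented for illustration, and the names
cell_means and simple_effect are assumptions introduced here. The
treatment raises scores among pretested subjects and lowers them by the
same amount among unpretested subjects, so the averaged treatment
effect vanishes although the cell means clearly differ.

```python
# Invented posttest cell means (assumption) showing a crossover pattern.
cell_means = {
    ("pretested", "treated"): 14.0,
    ("pretested", "untreated"): 10.0,
    ("unpretested", "treated"): 10.0,
    ("unpretested", "untreated"): 14.0,
}


def simple_effect(pretest_level):
    """Treatment effect within one level of the pretesting factor."""
    return (cell_means[(pretest_level, "treated")]
            - cell_means[(pretest_level, "untreated")])


effect_pretested = simple_effect("pretested")
effect_unpretested = simple_effect("unpretested")

# Main effect of treatment: simple effects averaged over pretest levels.
averaged_treatment_effect = (effect_pretested + effect_unpretested) / 2

# Interaction contrast: difference between the two simple effects.
interaction_contrast = effect_pretested - effect_unpretested

print(effect_pretested, effect_unpretested)
print(averaged_treatment_effect, interaction_contrast)
```

Reporting only the averaged effect here would hide two opposite
treatment effects, which is why a significant interaction is treated
above as a warning to reformulate the problem.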


V. SENSITIZATION WHEN THE PRETEST
INVOLVES LEARNING

Besides providing one of the first formal analyses of a research design
capable of structuring an experiment so that a pretest-treatment
interaction could be isolated and measured, R. L. Solomon (1949)
conducted an experiment demonstrating one source of such an interaction
effect. Two grammar school classes were equated for spelling ability by
teacher judgment. The three-group control design that was examined
earlier was used. Each group was pretested on a list of words of equal
difficulty by having the children spell the words. The groups were then
given a standard spelling lesson on the general rules of spelling which
served as the experimental treatment. The posttest consisted of the
same list of words to spell as were used as the pretest. In the
analysis which followed, there was some indication that the pretest
interacted with the treatment, although the usual two-way analysis of
variance could not be computed. It was concluded that the taking of the
pretest tended to diminish the spelling effectiveness of the subjects.
In this study the errors made in the pretest somehow were resistant to
the treatment and were made again during the posttest. Here then is an
instance of a pretest which, at least in part, is a learning experience
(or recall of already learned material) depressing the effect of the
treatment, which is another related learning task.

Beginning with a re-examination of Solomon's results, Entwisle (1961a,
1961b) performed two experiments of her own and found a significant
interaction effect among pretesting, IQ, and sex. Pretests consisted of
several multiple choice questions about state locations of large U.S.
cities. Treatment consisted of showing all subjects a slide with the
name of a city projected on it for 1 second. The subjects then wrote
the state name, and immediately afterward the correct state name was
shown for 1 second. This procedure was repeated for all items in the
pretest. Hence, the treatment consisted of a training session directly
related to material presented as pretest. The posttest consisted of the
same procedure as the pretest. There was no significant main effect of
pretesting, but the triple interaction mentioned above was significant.
Although the results are equivocal, there is a suggestion that
pretesting aided recall for high IQ individuals and was “mildly
hindering” for [...] IQ students. In another training study, Entwisle
(1961a) found no significant interaction effect or direct effect of
pretesting.


TABLE V

Experimental Design of Lana and King (1960)

Group I         Group II        Group III       Group IV

Reading         Reading         Reading         Reading
Recall          Recall          -               -
12 days         12 days         12 days         12 days
Film            -               Film            -
Recall          Recall          Recall          Recall


In 1960, Lana and King, using the four-group control design shown in
Table V, had all groups read a short summary of the mental health film
on ethnic prejudice, “The High Wall.” Two of the groups were asked to
recall the summary immediately after the reading as near to the
original as possible by writing it out on a sheet of paper. This recall
was considered the pretest. Sometime later the film was shown to one
group (pretested) that had been asked to recall the summary and to
another group that had not been asked to recall the summary
(unpretested). Immediately after presentation of the film all groups
were posttested by asking them to recall as near to the original as
possible the summary which had been read to them several days before.
Accuracy of recall was measured by dividing the story into “idea units”
and counting the number of units in each subject's protocol.

Even though the film used as the treatment has a definite attitudinal
component to it and is clearly didactic, our interest in this study was
to examine only recall components of the pretest-treatment-posttest
experimental design. The results indicated a significant main effect
for pretesting and no significant effects for the treatment nor for the
pretest-treatment interaction. Although the content of the summary read
before the first recall contained no more information than that which
could be seen and heard in the film, the act of recalling the written
summary was more effective than seeing the film in influencing the
precision of the second recall taken after the presentation of the
film. In this case, only the fact of the first recall significantly
affected later recall. The combination of first recall and film viewing
was not as effective in posttest recall. To the extent that a pretest
serves as a device for conscious recall of meaningfully connected
material, it can serve to influence posttest results. Attitude and
opinion questionnaires used as pretests might have the same effect
should part of the process of taking such a pretest involve recall of
previously held attitudes or opinions.

Hicks and Spaner (1962), working with attitudes toward mental patients
and hospital experience, found a pretest sensitization effect similar
to that shown by Lana and King. The former investigators suggested that
a learning factor might have been present in the attitude questionnaire
used as the pretest.

In the studies by Solomon, Lana and King, Hicks and Spaner, and the two
by Entwisle, virtually all possible effects of pretesting, when
pretesting was a learning or recall device, were shown. Solomon found a
significant interaction between pretest and treatment which had a
depressing effect on posttest score. Lana and King found a significant
main effect for pretesting, indicating greater recall with pretesting
than without pretesting, but no significant pretest-treatment
interaction. Entwisle found a pretest-treatment interaction with a
salutary effect of the pretest on posttest scores and in another study
found no significant pretest-treatment interaction of any kind.
Entwisle dismissed her negative results because sex was not a control
variable in that study, and she later showed that when it was
introduced as a variable a significant pretest-treatment-sex
interaction appeared. Even though we have examined only five relevant
studies we are probably safe in assuming that a pretesting procedure
which, in whole or part, involves some learning process, such as recall
of previously learned material, may very well have an effect on the
magnitude of the posttest score. Ordinarily, if the task or the recall
demanded by the pretest procedure is properly understood by the
subject, the effect on the posttest should be facilitative. However, as
we have seen in the case of Solomon's results and some of Entwisle's,
depressive effects can also occur.

There are different implications of the interpretations of pretest
results depending upon whether the sensitization is direct (Lana and
King) or operates in concert with the treatment (Solomon, Entwisle).
The former results are simpler to interpret and the experimenter need
not be as concerned with the general procedure of pretesting in that
experimental situation, since pretesting effects and treatment effects
are independent. When the pretest-treatment interaction effect is
significant there is always the danger that the interaction indicates a
change in the nature of the empirical phenomenon and a distortion of
that phenomenon so that it is markedly different from what it would
have been had a pretest not been used. It is in this situation of
measuring attitude by use of a pretest that we see an application of
the principle of indeterminacy which seems to operate in a manner
analogous to that found in subatomic physics.


VI. SENSITIZATION WHEN THE PRETEST INVOLVES
OPINIONS AND ATTITUDES

In Solomon's 1949 article, he indicated in a footnote that evidence was
available that the pretest may reduce the variance of the posttest in
attitudinal studies. The implication was that taking an attitudinal
pretest may restrict the attention of the subjects so that they are not
as variable in their reactions to the treatment as they would have been
had they not been required to take a pretest. In 1955, Piers, at the
suggestion of J. C. Stanley, used the Solomon four-group control design
to measure teacher attitudes toward students. The pretest consisted of
the Minnesota Teacher Attitude Inventory, and the posttest used the
Adorno et al. F Scale, a student rating scale, and a voca[...] test. No
pretest effect of any kind was found. Lana (1959a) used the Solomon
design with a questionnaire measuring opinion on vivisection as both
the pretest and posttest. The treatment consisted of a pro-vivisection
appeal. If a pretest sensitization were to occur, it would more likely
be evident in this study, where pretest and posttest devices were
identical, than in Piers' study where they were different. With the
same pretest and posttest devices, a recall factor alone should produce
some effect from pretest to posttest, as was noted in the Lana and
King, Solomon, and Entwisle studies. There was, however, no significant
pretesting main effect nor a significant pretest-treatment interaction
effect. Considering the fact that the topic used, “vivisection,” was
probably of minor interest to the subjects, one explanation for these
results is that they might have been little affected by the tasks asked
of them. Lana (1959b) repeated this study using a topic (ethnic
prejudice) which, it seemed reasonable to assume, was more interesting
and controversial to the college student subjects than vivisection. The
treatment consisted of “The High Wall,” the mental health film used by
Lana and King (1960). Pretest and posttest consisted of a modified
version of the California Ethnocentrism Scale. There were no
significant main or interactive effects involving the pretest.

DeWolfe and Governale (1964) administered to experimental and control
groups of student nurses a pretest consisting of the Nurse-Patient
Relationship Sort, Fear of Tuberculosis Questionnaire, and the IPAT
“Trait” Anxiety Scale. The Nurse-Patient Relationship Sort was given as
a posttest at the end of a specified nursing training period.
Appropriate controls were utilized to allow for an examination of
pretest and pretest-treatment interaction effects. The authors reported
that there was no consistent sensitization or desensitization as a
result of pretesting. Campbell and Stanley (1966) have indicated that
studies by Anderson (1959), Duncan et al. (1957), Sobol (1959), and
Zeisel (1947) also reported no sensitizing effect as a result of taking
a pretest, when opinions or attitudes were involved.

In 1949, Hovland, Lumsdaine, and Sheffield, in their classic work on
attitudes of the American soldier during World War II, reported that
something like a sensitization effect occurred as a consequence of
using a pretest in attitudinal studies. They found that there was less
attitude change in a group of soldiers administered pretest
questionnaires on topics relevant to the war effort than in those not
so pretested where all other experimental conditions were similar. By
their own admission, this conclusion was extremely tenuous since the
pretested group consisted of soldiers receiving infantry training at
one base while the nonpretested group consisted of soldiers receiving
armored vehicle training at another base. Also, the demographic
characteristics of the two groups of men were not comparable. This
study remains the only one involving opinions and attitudes in which
even a suggestion of the occurrence of pretest sensitization is made
and in which a unidirectional communication (i.e., a communication
supporting only one point of view) was used.

The overwhelming lack of a pretest sensitization effect when the
pretest is used to measure existing opinions or attitudes is as
convincing a demonstration as one is likely to find in social
psychological research. It seems reasonably safe to use a pretest
without concern for its direct or interaction (with the treatment)
effects on posttest results. This zero effect seems to be present over
a large variety of opinions and attitudes and a large variety of
treatment situations, as is evident from the diversity of techniques
used in the studies cited here. The vast majority of these studies
utilized a one-sided communication geared to influence opinion or
attitude change in one possible direction. The Anderson study is an
exception, and conceivably in the DeWolfe and Governale study the
training of the nurses may have contained incidents that represented
both positive and negative positions about the patient qua patient.
However, by and large, the studies involved a unidirectional persuasive
attempt as the treatment.

Within the context of research on order effects in persuasive
communications, it was decided to check for pretest sensitization when
subjects are exposed to both of two opposed arguments on the same
topic. This would be much the same situation as, for example, being
exposed to conflicting advertisements for similar products or to
apparently contrary arguments on political issues by individuals
running for the same political office. The presentation of opposed
arguments seemed sufficiently different from a treatment using a
unidirectional communication to warrant a renewed effort in looking for
pretest sensitization.

In an experiment by Lana and Rosnow (1963), subjects were divided into
various groups such that half of these groups received a questionnaire
measuring opinions either on the use of nuclear weapons or on public
censorship of written materials. This was accomplished by either
handing the questionnaire to the subject and asking him to complete it
or interspersing the questionnaire items throughout a regular
Psychology I examination, thus “hiding” it from the subject. When the
questionnaire is handed directly to the subject and he is asked to
complete it, it is highly likely that his attention will be focused
directly on the task. When the questionnaire items are interspersed
throughout a classroom examination, the attention and expectancies of
the subject are initially on a topic other than the content of the
questionnaire. Conceivably, here was a way to get a measure of initial
opinion on a subject matter, and to reduce any effects of pretest
sensitization. The initial purpose in carrying out this study was to
examine possible effects of pretesting on the order effects
(primacy-recency) of two opposed communications. Primacy refers to the
success in changing opinion of the initial argument of two opposed
communications and recency to a similar success of the argument
presented second. [...] this result, a reanalysis of the data indicated
that the average opinion change per group (mean absolute differences
from pretest to posttest regardless of the direction of change) was
significantly greater for groups where the pretest was hidden than for
the groups where the pretest was exposed.
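
The measure of change used in this reanalysis, the mean absolute
difference from pretest to posttest regardless of direction, can be
sketched as follows. The opinion scores are invented and the function
names are assumptions introduced for illustration. The sketch also
shows why the signed mean change would be a poor substitute when
subjects move in both directions.

```python
from statistics import mean


def mean_absolute_change(pre, post):
    """Mean of |posttest - pretest| over subjects in one group."""
    if len(pre) != len(post):
        raise ValueError("pretest and posttest must be paired")
    return mean(abs(b - a) for a, b in zip(pre, post))


def mean_signed_change(pre, post):
    """Mean of (posttest - pretest); opposite shifts cancel out."""
    if len(pre) != len(post):
        raise ValueError("pretest and posttest must be paired")
    return mean(b - a for a, b in zip(pre, post))


# Invented opinion scores (assumption) for two groups of four subjects.
hidden_pre, hidden_post = [4, 5, 3, 6], [6, 3, 5, 4]
exposed_pre, exposed_post = [4, 5, 3, 6], [5, 5, 3, 5]

hidden_abs = mean_absolute_change(hidden_pre, hidden_post)
exposed_abs = mean_absolute_change(exposed_pre, exposed_post)
hidden_signed = mean_signed_change(hidden_pre, hidden_post)

print(hidden_abs, exposed_abs, hidden_signed)
```

In the invented hidden-pretest group the signed change averages to zero
although every subject moved two scale points, which is the kind of
movement the absolute measure is meant to capture.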

The step that next seemed most appropriate was to attempt to
demonstrate this pretest sensitization when two opposed communications
were used as the treatment and when some subjects received no pretest
and others responded to an exposed pretest. Two experiments (Lana,
1964, 1966) were conducted for this purpose. Their results indicated
that the no-pretest groups changed their opinions in either direction
to a significantly greater degree than did the groups administered the
exposed pretest. The mean of the unpretested groups was estimated by
computing the mean of the means of the pretested groups. Consistent
with our earlier discussion, it was assumed that the groups not
pretested were homogeneous with those pretested since they were formed
randomly from the same population.
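
The estimation step just described can be sketched in Python. The
scores and group labels are invented, and one detail is an assumption
of this sketch rather than something stated above: change for each
unpretested subject is taken as the absolute distance of his posttest
score from the estimated pretest mean.

```python
from statistics import mean

# Invented pretest scores (assumption) for the pretested groups.
pretested_groups = {
    "group A": [4, 6, 5, 5],
    "group B": [3, 5, 4, 4],
}

# Estimated pretest mean for the unpretested groups: the mean of the
# pretested group means, justified by random formation of all groups
# from the same population.
estimated_pretest_mean = mean(
    mean(scores) for scores in pretested_groups.values()
)

# Invented posttest scores (assumption) for one unpretested group.
unpretested_post = [7, 2, 6, 3]

# Assumed change measure: absolute distance from the estimated mean.
unpretested_change = mean(
    abs(score - estimated_pretest_mean) for score in unpretested_post
)

print(estimated_pretest_mean, unpretested_change)
```

The estimate stands in for a score that was never observed, so the
comparison is only as good as the assumption that the pretested and
unpretested groups were homogeneous at the outset.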

These studies tend to support the notion that the pretest can act as a
device by which the individual commits himself to maintain his opinion
in the face of opposed (i.e., bidirectional) arguments presented after
he has made his commitment. Campbell and Brock (1957) have shown that
commitment to an attitudinal position inhibits change when commitment
is elicited after an initial attempt to influence the subject, but not
when response to a precommunication questionnaire constitutes the
commitment. Their suggestion, however, is that there are forms of
attitudinal commitment, usually made under public conditions, which
inhibit opinion or attitude change as a result of materials presented
later. Almost without exception, however, no pretest main or
interaction effect has been found in the situation where only one
opinion or attitude is measured by the pretest and where a
unidirectional communication serves as the treatment. Where
bidirectional arguments comprise the treatment, pretest sensitization
has consistently been present (see Table VI). A possible explanation
for these marked differences is the following: If the recipient
initially favors the position advocated, then a unidirectional
communication should yield greater opinion change, regardless of
pretest conditions. This is because the communication would support the
recipient's initial commitment, if no ceiling problem were encountered.
If half the subjects supported the position advocated, these subjects
would not need to consider their initial opinion when reacting to the
posttest.* There would be no need to resolve discrepancy between


* This assumes a symmetrical distribution of pretest scores with a mean
near the “indifference” point, an assumption which holds true for the
great majority of the studies heretofore cited.


TABLE VI

Summary of Sensitization Effects Indicated by Various Experiments

Pretest a learning device
    No sensitization:
        Entwisle (1961a)
    Sensitization, main effect of pretest:
        Lana and King (1960)
        Hicks and Spaner (1962)
    Sensitization, interaction of pretest with treatment:
        Solomon (1949)
        Entwisle (1961b)

Pretest an attitudinal device (unidirectional)
    No sensitization:
        Zeisel (1947)
        Duncan et al. (1957)
        Anderson (1959)
        Lana (1959a)
        Lana (1959b)
        Sobol (1959)
        DeWolfe and Governale (1964)
    Sensitization:
        Hovland, Lumsdaine, and Sheffield (1949)

Pretest an attitudinal device (bidirectional)
    Sensitization:
        Lana and Rosnow (1963)
        Lana (1964)
        Lana (1966)


what these individuals wrote on their pretests and the point of view
represented in the communication. Both would be consistent with one
another. Thus there is no resistance to the communication because of
prior commitment for half of the subjects. However, if two opposed
arguments were presented as the communication, one of the positions
would automatically be discrepant with every subject's initial
commitment. Pretest commitment, challenged by one of the arguments,
produces resistance to change, and hence the result is a smaller change
score from pretest to posttest.


VII. CAUTIONS ASSOCIATED WITH THE USE OF A PRETEST

As we have seen, when the mensurative process involves subjects being
required to respond to an experimental condition in a manner [...]
their motives, opinions, attitudes, feelings, or beliefs, the a[...] of
a pretest to measure these characteristics which is free from influence
on that process may be difficult to find. The subject's [...]
manipulatory intent of the experimenter, dealt with in detail [...]
(Chapter 2), the experimenter's expectations (Rosenthal, Chapter 6),
the subject's concern about being evaluated (Rosenberg, Chapter 7): all
of these can exert an influence through use of a pretest, or act as
separate effects, and thereby confound the effect of the experimental
treatment. Indeed, all of the other chapters of this book deal with
factors which, though extrinsic to the experimental situation as
conceived by the experimenter, can affect the magnitude and quality of
the treatment and its effect on behavior. Campbell and Stanley (1966,
20) have noted, “In the usual psychological experiment, if not in
educational research, a most prominent source of unrepresentativeness
is the patent artificiality of the experimental setting and the
student's knowledge that he is participating in an experiment. For
human experimental subjects, a higher order problem-solving task is
generated, in which the procedures and experimental treatment are
reacted to not only for their simple stimulus values, but also for
their role as clues in divining the experimenter's intent.”

Campbell and Stanley (1966) have also observed that the posttest may
create an artificial “experiment participating effect” for the subject
if the connections between treatment and posttest (or among pretest,
treatment, and posttest) are obvious. One way that this perception on
the part of the subject might be changed is by using a different (e.g.,
equivalent form) posttest than was used as the pretest. With few
exceptions (e.g., Piers, 1955) most of the studies mentioned above used
identical pre- and posttests; that is, they are all examples of what
was earlier referred to as Case II.

At this point the alternative to be explored is that of substituting
for the pretest some other technique of observation or measurement of
initial standing which lacks the obvious and telltale characteristics
of the pretest (i.e., as in Case I). One alternative to administering a
pretest, an alternative which allows a reasonable estimation of the
strength of a subject's opinion and attitude toward some social object,
is the use of groups which, because of the unified stand of their
members regarding the topic in question, are naturally homogeneous. For
example, one might speculate that the opinions concerning birth control
of members of the Catholic college organization, The Newman Club, would
cluster in the negative half of the opinion continuum. In his
examination of order effects when opposed communications on the same
topic were presented to subjects, Lana (1964b) found that intact groups
have a tendency to be more rigid in their commitment to a given opinion
or attitude, although these groups are of such a nature that more
information is available about their initial opinions, which is useful
in solving the pretest sensitization problem. It is more difficult to
change their opinions via a persuasive communication than those of
groups formed randomly. They are, therefore, not the most ideal
subjects to use in a demonstration of the facility with which opinion
or attitude change can be effected under various communicative
conditions.

Recently, E. J. Webb, D. T. Campbell, R. D. Schwartz, and L. Sechrest
(1966) published a book which included a summary of what may be
conceived as various alternatives to the pretest questionnaire
technique. It is their contention that there are “nonreactive measures”
which can be used to determine relative states of the organism,
measures which assure the experimenter that the organism is not
affected by the process of measurement itself. In short, there are
measures which eliminate the operation of the principle of
indeterminacy in the opinion or attitude measurement situation.
(Reactive measures are those which sensitize the subject to the fact of
being measured, or of being an object of concern to the experimenter
and which, therefore, serve in some instances to change the behavior of
the subject as a result.) Their point is that for the researcher
concerned with social and other complex conditions affecting the
organism, a variety of mensurative techniques are available which do
not interfere with the process being measured by virtue of the fact
that the subject is totally unaware of the measurement process. They
divide these “unobtrusive” measures into five categories. The first is
that of physical traces. For example, it is possible to determine what
displays are most popular at a museum by examining the wear of the
floor tiles directly in front of the exhibit. This is a nonreactive
measure. In contrast, a reactive measure would be to ask a number of
visitors which exhibits they spent the most time at or which they
enjoyed the most. Conceivably, an individual asked this question might
respond with the name of an exhibit quite at variance with the one he
actually spent the most time at or most enjoyed. If the subject wished
to appear “cultured,” he might say that he spent more time looking at
the fish fossils than he did looking at the stuffed [...]. The second
source of nonreactive data is what Webb et al. call the “running
record,” where data of a census nature have been collected by society
for purposes other than those of the experimenter, but which provide
useful information for him. Examples include [...] city budgets, and
voting statistics. Episodic and private records would serve the same
function as running records. Simple observation of expressive movements
and casual conversation comprise another category of nonreactive
measurement. The final type of measure involves the use of hidden
mechanical or other devices to record behavior in situations where the
subject is unaware of the ongoing measurement.
Paying all due respect to the cleverness and imagination shown by Webb,
Campbell, Schwartz, and Sechrest in devising and systematizing these
nonreactive, unobtrusive measures, the strongest impact on this writer
after reading their book was to reinforce his belief in the necessity
of looking for ways and devices to utilize reactive measures where the
reaction (sensitization) on the part of the subject can either be
measured or be eliminated altogether. The unobtrusive measures listed
by these authors are rarely relevant to the research of a good many
psychologists who use some sort of pretest measure. However, it should
be noted that the authors intended these techniques to be as much
“posttests” as “pretests.” Since it has been shown that, at least in
attitude research, pretest measures, if they have any impact at all,
depress the effect being measured, any differences which can be
attributed to the experimental treatment probably represent strong
treatment effects. In short, when pretest measures exert any influence
at all in attitude research, the effect is to produce a Type II error,
which is more tolerable to most psychological researchers than is an
error of the first kind.
It would seem that a researcher’s decision to use a pretest or, instead,
to utilize a randomization design with only posttest measures is partly
based upon personal characteristics having little to do with the logic
of the experiment. As we have indicated, what one gains in information
by utilizing a pretest he sometimes loses in increased sensitization of
the subject. What he gains in purity of experimental effect by utilizing
a randomization design he loses in knowledge of pretreatment conditions
existing in the organism. In some cases the goals of the experiment
set the risk one will take. However, in many situations one is caught
between the Scylla of sensitization and the Charybdis of ignorance of
pre-existing conditions. The choice of procedure may be arbitrary.
If, however, one does choose to utilize some form of the pretest-posttest
design, disguising both pretest and posttest as much as possible
may reduce sensitization of the subject. For example, as part of an
as yet unfinished master’s thesis, Julian Biller hid an attitudinal pretest
in a questionnaire ostensibly concerned with student reaction to various
university administration policies and to student life in general. It was
explained to the students that the information was useful to the instructor
in shaping his course toward the needs of the students. Since the
pretest items were concerned with attitudes toward the American college
grading system, they were not conspicuous by their content. In a similar
manner, the posttest items were hidden in a different questionnaire presented
to the subjects sometime after they had been exposed to a persuasive
communication concerning various grading systems. Of course,
the astute subject may still recognize the key items in the questionnaire
as being related to the communications he listened to sometime in the
past. However, recognition and therefore sensitization effects may be
minimized by use of this technique.

Pretest sensitization might also be minimized by increasing the time
between application of the pretest and the presentation of the
communications and the posttest. However, one risks the possibility that
factors external to the experimental situation may influence pretest-posttest
change scores if the interval between the two is great. Conceivably,
an optimum time interval between pretest and treatment as well as
between treatment and posttest might be found.
Special methodological problems are raised when human subjects are
used in psychological experiments, mainly because subjects’ thoughts
about an experiment may affect their behavior in carrying out the experimental
task.

To counteract this problem psychologists have frequently felt it necessary
to develop ingenious, sometimes even diabolical, techniques in order
to deceive the subject about the true purposes of an investigation (see
Stricker, 1967; Stricker, Messick, and Jackson, 1967). Deception may
not be the only, nor the best, way of dealing with certain issues, yet
we must ask what special characteristic of our science makes it necessary
to even consider such techniques when no such need arises in, say,
physics. The reason is plain: we do not study passive physical particles

* The substantive work reported in this paper was supported in part by Contract
#Nonr 4731 from the Group Psychology Branch, Office of Naval Research. The
research on the detection of deception was supported in part by the United States
Army Medical Research and Development Command Contract #DA-49-193-MD-2647.

† I wish to thank Frederick J. Evans, Charles H. Holland, Edgar P. Nace, Ulric
Neisser, Donald N. O’Connell, Emily Carota Orne, David A. Paskewitz, Campbell
W. Perry, Karl Rickels, David L. Rosenhan, Robert Rosenthal, and Ralph Rosnow
for their thoughtful criticisms and many helpful suggestions in the preparation
of this manuscript.





DEMAND CHARACTERISTICS AND QUASI-CONTROLS


In less dramatic ways the subject’s recognition that he is not merely
responding to a set of stimuli but is doing so in order to produce data
may exert an influence upon his performance. Inevitably he will wish
to produce “good” data, that is, data characteristic of a “good” subject.
To be a “good” subject may mean many things: to give the right responses,
i.e., to give the kind of response characteristic of intelligent
subjects; to give the normal response, i.e., characteristic of healthy
subjects; to give a response in keeping with the individual’s self-perception;
etc., etc. If the experimental task is such that the subject sees
himself as being evaluated he will tend to behave in such a way as
to make himself look good. (The potential importance of this factor
has been emphasized by Rosenberg, 1965; see Chapter 7.)

Investigators have tended to be intuitively aware of this problem and
in most experimental situations tasks are constructed so as to be ambiguous
to the subject regarding how any particular behavior might make
him look especially good. In some studies investigators have explicitly
utilized subjects’ concern with the evaluation in order to maximize motivation.
However, when the subject’s wish to look good is not directly
challenged, another set of motives, one of the common bases for volunteering,
will become relevant. That is, beyond idiosyncratic reasons for
participating, subjects volunteer, in part at least, to further human knowledge,
to help provide a better understanding of mental processes that
ultimately might be useful for treatment, to contribute to science, etc.
This wish which, despite currently fashionable cynicism, is fortunately
still the mode rather than the exception among college student volunteers,
has important consequences for the subject’s behavior. Thus, in
order for the subject to see the data as useful, it is essential that he
assume that the experiment be important, meaningful, and properly executed.
Also, he would hope that the experiment work, which tends
to mean that it prove what it attempts to prove. Reasons such as
these may help to clarify why subjects are so committed to see a
logical purpose in what would otherwise appear to be a trivial experiment,
why they are so anxious to ascribe competence to the experimenter
and, at the end of a study, are so concerned that their data prove useful.
The same set of motives also helps to understand why subjects often
will go to considerable trouble and tolerate great inconvenience provided
they are encouraged to see the experiment as important. Typically they
will tolerate even intense discomfort if it seems essential to the experiment;
on the other hand, they respond badly indeed to discomfort which
they recognize as due to the experimenter’s ineptness, incompetence,
or indifference. Regardless of the extent to which they are reimbursed,
most subjects will be thoroughly alienated if it becomes apparent that,
but active, thinking human beings like ourselves. The fear that knowledge
of the true purposes of an experiment might vitiate its results
stems from a tacit recognition that the subject is not a passive responder
to stimuli and experimental conditions. Instead, he is an active participant
in a special form of socially defined interaction which we call
“taking part in an experiment.”

It has been pointed out by Criswell (1958), Festinger (1957), i j
(1961), Rosenberg (1965), Wishner (1965), and others, and discussed
at some length by the author elsewhere (Orne, 1959b, 1962), that subjects
are never neutral toward an experiment. While, from the investigator’s
point of view, the experiment is seen as permitting the controlled
study of an individual’s reaction to specific stimuli, the situation tends
to be perceived quite differently by his subjects. Because subjects are
active, sentient beings, they do not respond to the specific experimental
stimuli with which they are confronted as isolated events but rather
they perceive these in the total context of the experimental situation.
Their understanding of the situation is based upon a great deal of knowledge
about the kind of realities under which scientific research is conducted,
its aims and purposes, and, in some vague way, the kind of
findings which might emerge from their participation and their responses.
The response to any specific set of stimuli, then, is a function of both the
stimulus and the subject’s recognition of the total context. Under some
circumstances, the subject’s awareness of the implicit aspects of the psychological
experiment may become the principal determinant of his behavior.
For example, in one study an attempt was made to devise a
tedious and intentionally meaningless task. Regardless of the nature of the
request and its apparently obvious triviality, subjects continued to comply,
even when they were required to perform work and to destroy the product.
Though it was apparently impossible for the experimenter to know
how well they did, subjects continued to perform at a high rate of speed
and accuracy over a long period of time. They ascribed (correctly, of
course) a sensible motive to the experimenter and meaning to the procedure.
While they could not fathom how this might be accomplished,
they also quite correctly assumed that the experimenter could and
would check their performance* (Orne, 1962). Again, in another study,
subjects were required to carry out such obviously dangerous activities
as picking up a poisonous snake or removing a penny from fuming nitric
acid with their bare hands (Orne and Evans, 1965). Subjects complied,
correctly surmising that, despite appearances to the contrary, appropriate
precautions for their safety had been taken.


* These pilot studies were performed by Thomas Menaker.
essential to establish whether the subject or the experimenter is the
one who is deceived by the experimental manipulation.


I. DEMAND CHARACTERISTICS AND 
EXPERIMENTER BIAS 

Demand characteristics and the subject’s reaction to them are, of
course, not the only subtle and human factors which may affect the
results of an experiment. Experimenter bias effects, which have been
studied in such an elegant fashion by Rosenthal (1963, 1966), also are
frequently confounding variables. Experimenter bias effects depend in
large part on experimenter outcome expectations and hopes. They can
become significant determinants of data by causing subtle but systematic
differences in (a) the treatment of subjects, (b) the selection of cases,
(c) observation of data, (d) the recording of data, and (e) systematic
errors in the analysis of data.

To the extent that bias effects cause subtle changes in the way the
experimenter treats different groups, they may alter the demand characteristics
for those groups. In social psychological studies, demand characteristics
may, therefore, be one of the important ways in which experimenter
bias is mediated. Conceptually, however, the two processes are
very different. Experimenter bias effects are rooted in the motives of
the experimenter, but demand characteristic effects depend on the perception
of the subject.

The effects of bias are by no means restricted to the treatment of
subjects. They may equally well function in the recording of data and
its analysis. As Rosenthal (1966) has pointed out, they can readily be
demonstrated in all aspects of scientific endeavor, “N rays” being a
prime example. Demand characteristics, on the other hand, are a problem
only when we are studying sentient and motivated organisms. Light
rays do not guess the purpose of the experiment and adapt themselves
to it, but subjects may.

The repetition of an experiment by another investigator with different
outcome orientation will, if the findings were due to experimenter bias,
lead to different results. This procedure, however, may not be sufficient to
clarify the effects of demand characteristics. Here it is the leanings of the
subject, not of the experimenter, that are involved. In a real sense, for
the subject an experiment is a problem-solving situation. Riecken (1962,
p. 31) has succinctly expressed this when he says that aspects of the experimental
situation lead to “a set of inferential and interpretive activities
on the part of the subject in an effort to penetrate the experimenter’s
for one reason or another, their experimental performance must be discarded
as data. Interestingly, they will tend to become angry if this is
due to equipment failure or an error on the part of the experimenter,
whereas if they feel that they themselves are responsible, they tend to
be disturbed rather than angry.

The individual’s concern about the extent to which the experiment
helps demonstrate that which the experimenter is attempting to demonstrate
will, in part, be a function of the amount of involvement with
the experimental situation. The more the study demands of him, the
more discomfort, the more time, the more effort he puts into it, the
more he will be concerned about its outcome. The student in a class
asked to fill out a questionnaire will be less involved than the volunteer
who stays after class, who will in turn be less involved than the volunteer
who is required to go some distance, who will in turn be less involved
than the volunteer who is required to come back many times, etc., etc.

Insofar as the subject cares about the outcome, his perception of his
role and of the hypothesis being tested will become a significant determinant
of his behavior. The cues which govern his perception, which
communicate what is expected of him and what the experimenter hopes
to find, can therefore be crucial variables. Some time ago I proposed
that these cues be called the “demand characteristics of an experiment”
(Orne, 1959b). They include the scuttlebutt about the experiment, its
setting, implicit and explicit instructions, the person of the experimenter,
subtle cues provided by him, and, of particular importance, the experimental
procedure itself. All of these cues are interpreted in the light
of the subject’s past learning and experience. Although the explicit instructions
are important, it appears that subtler cues from which the
subject can draw covert or even unconscious inference may be still more
powerful.

Recognizing that the subject’s knowledge affects his performance, investigators
have employed various means to disguise the true purpose
of the research, thereby trying to alter the demand characteristics of
experimental situations in order to make them orthogonal to the experimental
effects. Unfortunately, the mere fact that an investigator goes
to great lengths to develop a “cute” way to deceive the subject in no
way guarantees that the subject is, in fact, deceived. Obviously it is

* Obviously, how the subject is treated will affect his motivation.
If the experimenter seems casual, disinterested, or worse, incompetent, he will
both resent it and mobilize little investment. On the other hand, if the experimenter
seems both to care about the outcome and to appear competent, subjects will
want to help even at great inconvenience to themselves. Thus we have
seen subjects return from distant cities to complete a study.
effects. The example is unusual only because some of its demands were
deliberately manipulated and treated as experimental variables in their 
own right. The results of the explicit manipulation enabled us to under- 
stand an experimental result which was otherwise contrary to field 
findings. 

In recent years there have been a number of studies on the detection 
of deception — more popularly known as “lie detection” — with the gal- 
vanic skin response (GSR) as the dependent variable. In one such study, 
Ellson, Davis, Saltzman, and Burke (1952) reported a very curious find-
ing. Their experiment dealt with the effect which knowledge of results 
can have on the GSR. After the first trial, some subjects were told that 
their lies had been detected, while others were told the opposite. This 
produced striking results on the second trial: those who believed that 
they had been found out became harder to detect the second time, 
while those who thought they had deceived the polygraph on Trial 
1 became easier to detect on Trial 2. This finding, if generalizable to 
the field, would have considerable practical implications. Traditionally, 
interrogators using field lie detectors go to great lengths to show the 
suspect that the device works by “catching” the suspect, as it were. 
If the results of Ellson et al. were generalizable to the field situation, 
the very procedure which the interrogators use would actually defeat 
the purpose for which it was intended by making subsequent lies of 
the suspect even harder to detect. 

Because the finding of Ellson et al. runs counter to traditional practical
experience, it seemed plausible to assume that additional variables might
be involved in the experimental situation. The study by Ellson et al.
was therefore replicated by Gustafson and Orne* with equivocal results.
Postexperimental interviews with subjects revealed that many college
students apparently believe that the lie detector works with normal individuals
and that only habitual liars could deceive a polygraph. Given
these beliefs, it was important for the student volunteers that they be
detected. In that respect the situation of the experimental subjects differs
markedly from that of the suspect being interrogated in a real life situation.
Fortunately, with the information about what most experimental
subjects believe, it is possible to manipulate these beliefs and thereby
change the demand characteristics of the Ellson et al. study. Two groups
of subjects were given different information about the effectiveness of
the lie detector.

One group was given information congruent with this widely held
belief and told: “This is a detection of deception experiment. We are
trying to see how well the lie detector works. As you know, it is not

* Unpublished study, 1962.
inscrutability.” For example, if subjects are used as their own
controls, they may easily recognize that differential treatment ought
to produce differential results, and they may act accordingly. A similar
effect may appear even when subjects are not their own controls. Those
who see themselves as controls may on that account behave differently
from those who think of themselves as the “experimentals.”

It is not conscious deception by the subject which poses the problem
here. That occurs only rarely. Demand characteristics usually operate
subtly in interaction with other experimental variables. They change
the subject’s behavior in such a way that he is often not clearly aware
of their effect. In fact, demand characteristics may be less effective or
even have a paradoxical action if they are too obvious. With the constellation
of motives that the usual subject brings to a psychological experiment,
the “soft sell” works better than the “hard sell.” Rosenthal
has reported a similar finding in experimenter bias: the effect is weakened,
or even reversed, if the experimenter is paid extra to bias his
results.

It is possible to eliminate the experimenter entirely, as has been suggested
by Charles Slack* some years back in a Gedanken experiment. He
proposed that subjects be contacted by mail, be asked to report to a
specific room at a specific time, and be given all instructions in a written
form. The recording of all responses as well as the reinforcement of
subjects would be done mechanically. This procedure would go a long
way toward controlling experimenter bias. Nevertheless, it would have
demand characteristics, as would any other experiment which we might
conceive; subjects will always be in a position to form hypotheses about
the purpose of an experiment.

Although every experiment has its own demand characteristics, these
do not necessarily have an important effect on the outcome. They become
important only when they interact with the effect of the independent
variable being studied. Of course, the most serious situation is
one where the investigator hopes to draw inferences from an experiment
where one set of demand characteristics typically operates to a real
life situation which lacks an analogous set of conditions.


II. PRE-INQUIRY DATA AS A BASIS FOR
MANIPULATING DEMAND CHARACTERISTICS

A recent psychophysiological study (Gustafson and Orne, 1965) takes
one possible approach to the clarification of demand characteristic

* Personal communication, 1959.
measure which is often erroneously assumed to be outside of volitional
control, namely a physiological response, in this instance the GSR.
This study serves as a link toward resolving the discrepancy between
the laboratory findings of Ellson et al. (1952) and the experience of
interrogators using the “lie detector” in real life.

It appeared possible in this experiment to use simple variations in 
instructions as a means of varying demand characteristics. The success 

TABLE I

Number of Successful and Unsuccessful Detections on Trial I
for the Two Subgroups of the n Detected and n Deceive Groupsᵃ

                              Told detected    Told not detected   χ² between
                              (subsequently)   (subsequently)      columns 1 and 2ᵇ

“Need to be Detected Group”
  Detected                          9                13            χ² = 1.31
  Not Detected                      7                 3            n.s.

“Need to Deceive Group”
  Detected                         13                11            χ² = 0.17
  Not Detected                      3                 5            n.s.

χ² between n Detected          χ² = 1.31        χ² = 0.17
and n Deceive Groups              n.s.             n.s.

Note. From L. A. Gustafson and M. T. Orne, “Effects of perceived role and role success
on the detection of deception,” Journal of Applied Psychology, 49, 1965, 412-417.
Copyright (1965) by the American Psychological Association, and reproduced by
permission.

ᵃ Note that Ss were not given information about the success of detection until after
the trial on which these data are based.

ᵇ A multiple chi-square contingency analysis (Sutcliffe, 1957) was used to analyze
the departures from expected frequencies in the entire table. Neither the chi-square
components for each variable alone, nor the interaction between variables, were
significant.

of the manipulation may be ascribed to the fact that the instructions
themselves reflected views that emerged from interview data, and both
sets of instructions were congruent with the experimental procedure.
Only if instructions are plausible (a function of their congruence with
the subjects’ past knowledge as well as with the experimental procedure)
will they be a reliable way of altering the demand characteristics.
In this instance the instructions were not designed to manipulate the
subjects’ attitude directly; rather they were designed to provide differential
background information relevant to the experiment. This background
possible to detect lying in the case of psychopathic personalities or
habitual liars. We want you to try your very best to fool the lie detector
during this experiment. Good luck.” These instructions tried to maximize
the kind of demand characteristics which might have been functioning
in the Ellson et al. study, and it was assumed that the subjects would
want to be detected in order to prove that they were not habitual liars.
The other group was given information which prior work (Gustafson
and Orne, 1963) had shown to be plausible and motivating; they were
told, “This is a lie detection study and while it is extremely difficult
to fool the lie detector, highly intelligent, emotionally stable, and mature
individuals are able to do so.” The demand characteristics in this case
were designed to maximize the wish to deceive.

From that point on, the two groups were treated identically. They
drew a card from an apparently randomized deck; the card had a number
on it which they were to keep secret. All possible numbers were
then presented by a prerecorded tape while a polygraph recorded the
subjects’ GSR responses. On the first such trial, the “detection ratios”
(that is, the relative magnitudes of the critical GSR responses) in the two
groups were not significantly different (see Table I). When the first trial
was over, the experimenter gave half the subjects in each group the impression
that they had been detected, by telling them what their number
had been. (The experimenter had independent access to this information.)
The other half were given the impression that they had fooled
the polygraph, the experimenter reporting an incorrect number to them.
A table of random numbers was used to determine, independently of
his actual GSR, which kind of feedback each subject received.

A second detection trial with a new number was then given. The
dramatic effects of the feedback in interaction with the original instructions
are visible in Table II. Two kinds of subjects now gave large
GSRs to the critical number: those who had wanted to be detected
but yet had not been detected, and also those who had hoped to deceive
and yet had not deceived. (This latter group is analogous to the field
situation.) On the other hand, subjects whose hopes had been confirmed
now responded less and thus became harder to detect, regardless of
what those hopes had been. Those who had wanted to be detected,
and indeed had been detected, behaved physiologically like those who
had wanted to deceive and indeed had deceived.
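
The feedback assignment described above can be sketched in a few lines of Python. This is an illustration only: the study used a printed table of random numbers, whereas the sketch substitutes a seeded pseudo-random shuffle, and the group size of 32 (16 per feedback condition) is read off the cell totals of Table II. The names `assign_feedback`, `told_detected`, and `told_not_detected`, and the subject labels, are hypothetical.

```python
import random


def assign_feedback(subject_ids, rng):
    """Split one instruction group into two equal feedback halves at random,
    without reference to any subject's actual GSR."""
    ids = list(subject_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    assignment = {sid: "told_detected" for sid in ids[:half]}
    assignment.update({sid: "told_not_detected" for sid in ids[half:]})
    return assignment


# Fixed seed only so that the sketch is repeatable (an assumption, not the
# original procedure).
rng = random.Random(1965)

# Hypothetical subject labels; 32 subjects per instruction group.
groups = {
    "need_to_be_detected": [f"D{i:02d}" for i in range(32)],
    "need_to_deceive": [f"F{i:02d}" for i in range(32)],
}

design = {name: assign_feedback(ids, rng) for name, ids in groups.items()}

for name, assignment in design.items():
    counts = {}
    for condition in assignment.values():
        counts[condition] = counts.get(condition, 0) + 1
    print(name, counts)
```

Shuffling within each instruction group, rather than over all subjects at once, keeps the two feedback conditions balanced inside each group, which is what "half the subjects in each group" requires.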

This effect is an extremely powerful but also an exceedingly subtle
one. The differential pretreatment of groups is not apparent on the first
trial. Only on the second trial do the manipulated demand characteristics
produce clear cut differential results, in interaction with the independent
variable of feedback. Furthermore, we are dealing with a
and the experimental procedure may become more potent determinants
of how the study is perceived.


III. DEALING WITH DEMAND CHARACTERISTICS

Studies such as the one described in which the demand characteristics
are deliberately manipulated contribute little or nothing to the question
of how they can be delineated. In order to design the lie detection
experiment in the first place, a thorough understanding of the demand
characteristics involved was essential. How can such an understanding
be obtained? As was emphasized earlier, the problem arises basically
because the human subject is an active organism and not a passive
responder. For him, the experiment is a problem solving situation to
be actively handled in some way. To find out how he is trying to handle
it, it has been found useful to take advantage of the same mental processes
which would otherwise be confounding the data. Three techniques
were proposed which do just that. Although apparently different, the
three methods serve the same basic purpose. For reasons to be explained
later, I propose to call them “quasi-controls.”

A. Postexperimental Inquiry 

The most obvious way of finding out something about the subject’s
perception of the experimental situation is the postexperimental inquiry.
It never fails to amaze me that some colleagues go to the trouble of
inducing human subjects to participate in their experiments and then
squander the major difference between man and animal: the ability
to talk and reflect upon experience.

To be sure, inquiry is not always easy. The greatest danger is the
“pact of ignorance” (Orne, 1959a) which all too commonly characterizes
the postexperimental discussion. The subject knows that if he has “caught
on” to some apparent deception and has an excess of information about
the experimental procedure he may be disqualified from participation
and thus have wasted his time. The experimenter is aware that the
subject who knows too much or has “caught on” to his deception will
have to be disqualified; disqualification means running yet another subject,
still further delaying completion of his study. Hence, neither party
to the inquiry wants to dig very deeply.

The investigator, aware of these problems and genuinely more interested
in learning what his subjects experienced than in the rapid collection
of data, can, however, learn a great deal about the demand characteristics
of a particular experimental procedure by judicious inquiry.
information was designed to provide very different contexts for the subjects’
performance within the experiment. We believe this approach was
effective because it altered the subjects’ perception of the experimental
situation, which is the basis of demand characteristics in any experiment.
It is relevant that the differential instructions in no way told subjects
to behave differently. Obviously subjects in an experiment will tend
to do what they are told to do (that is the implicit contract of the
situation) and to demonstrate this would prove little. Our effort here

TABLE II

Number of Successful and Unsuccessful Detections on Trial II
for the Two Subgroups of the n Detected and n Deceive Groupsᵃ

                              Told detected    Told not detected   χ² between
                              (subsequently)   (subsequently)      columns 1 and 2

“Need to be Detected Group”
  Detected                          4                14            χ² = 10.28
  Not Detected                     12                 2            p < .0...

“Need to Deceive Group”
  Detected                         15                 3            χ² = 15.36
  Not Detected                      1                13

χ² between n Detected          χ² = 12.96       χ² = 12.55
and n Deceive Groups           p < .001         p < .001

Note. From L. A. Gustafson and M. T. Orne, “Effects of perceived role and role
success on the detection of deception,” Journal of Applied Psychology, 49, 1965,
412-417. Copyright (1965) by the American Psychological Association, and reproduced
by permission.

ᵃ A multiple chi-square contingency analysis here shows that neither information
given, nor motivation (n Detect vs. n Deceive) have significant effects.
The relevant chi-square values, calculated from partitioned subtables, are .25
and .00 respectively (df = 1). However, successful detection does depend significantly
on the interaction between information and motivation (χ² = ..., p < .001;
df = 1).
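
The four χ² entries in the body of Table II can be checked by hand. The Python sketch below assumes that each 2 × 2 comparison was tested with a chi-square corrected for continuity (Yates), which the table does not state; the function name `yates_chi_square`, the helper `compare`, and the dictionary layout are illustrative only. Under that assumption the computed values come out at about 10.29, 15.37, 12.96, and 12.55, agreeing with the tabled 10.28, 15.36, 12.96, and 12.55 to within rounding.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square (df = 1) with Yates' continuity correction for the
    2 x 2 table [[a, b], [c, d]]. Using the correction is an assumption."""
    n = a + b + c + d
    # The corrected difference is not allowed to fall below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * corrected ** 2 / denominator


# Trial II counts from Table II, stored as (detected, not detected).
table_ii = {
    ("need_detected", "told_detected"): (4, 12),
    ("need_detected", "told_not_detected"): (14, 2),
    ("need_deceive", "told_detected"): (15, 1),
    ("need_deceive", "told_not_detected"): (3, 13),
}


def compare(cell_1, cell_2):
    """Chi-square for detected vs. not detected across two design cells."""
    (a, c), (b, d) = table_ii[cell_1], table_ii[cell_2]
    return yates_chi_square(a, b, c, d)


results = {
    "feedback_within_need_detected": compare(
        ("need_detected", "told_detected"),
        ("need_detected", "told_not_detected")),
    "feedback_within_need_deceive": compare(
        ("need_deceive", "told_detected"),
        ("need_deceive", "told_not_detected")),
    "groups_within_told_detected": compare(
        ("need_detected", "told_detected"),
        ("need_deceive", "told_detected")),
    "groups_within_told_not_detected": compare(
        ("need_detected", "told_not_detected"),
        ("need_deceive", "told_not_detected")),
}

for label, value in results.items():
    print(f"{label}: {value:.2f}")
```

The first two comparisons correspond to the right-hand column of the table (feedback within each instruction group), the last two to its bottom row (instruction groups within each feedback condition).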

was to create the kind of context which might differentiate the laboratory
from the field situation and which might explain differential results in
these two contexts. Plausible verbal instructions were one way of accomplishing
this end. (Also see Cataldo, Silverman, and Brown;
Kroger, 1967; Page and Lumia, 1968; Silverman, 1968.)

Unless verbal instructions are very carefully designed and pretested
they may well fail to achieve such an end. It can be extremely difficult
to predict how, if at all, demand characteristics are altered by instructions,
and frequently more subtle aspects of the experimental situation
at the very end or even in retrospect during the inquiry itself, and
he may then verbalize during the inquiry an awareness that will have
had little or no effect on his performance during the experiment. For
this reason, one may wish to carry out inquiry procedures at significant
junctures in a long experiment.* This technique is quite expensive and
time consuming. It requires running different sets of subjects to different
points in the experiment, stopping at these points as if the experiment
were over (for these subjects it, in fact, is), and carrying out inquiries.
While it would be tempting to use the same group of subjects and
to continue to run them after the inquiry procedure, such a technique
would in many instances be undesirable because exhaustive inquiries
into the demand characteristics, as the subject perceives them at a given
point in time, make him unduly aware of such factors subsequently.

While inquiry procedures may appear time consuming, in actual practice
they are relatively straightforward and efficient. Certainly they are
vastly preferable to finding at the conclusion of a large study that the
data depend more on the demand characteristics than on the independent
variables one had hoped to investigate. It is perhaps worth remembering
that, investigators being human, it is far easier to do exhaustive
inquiry during pilot studies when one is still motivated to find out what
is really happening than in the late stages of a major investigation.
Indeed this is one of the reasons why pilot investigations are an essential
prelude to any substantive study.

B. Non-experiment

Another technique, and a very powerful one, for uncovering the demand
characteristics of a given experimental design is the "pre-inquiry"
(Orne, 1959a) or the "non-experiment."† This procedure was independently
proposed by Riecken (1962). A group of persons representing
the same population from which the actual experimental subjects will
eventually be selected are asked to imagine that they are subjects
themselves. They are shown the equipment that is to be used and the room
in which the experiment is to be conducted. The procedures are explained
in such a way as to provide them with information equivalent
to that which would be available to an experimental subject. However,
they do not actually go through the experimental procedure; it is only
explained. In a non-experiment on a certain drug, for example, the
participant would be told that subjects are given a pill. He would be shown
the pill. The instructions destined for the experimental subjects would

* These results may also be conceptualized in terms of learning theory.

† Ulric Neisser suggested this persuasive term.
It is essential that he elicit what the subject perceives the experiment
is about, what the subject believes the investigator hopes and expects
to find, how the subject thinks others might have reacted in this situation,
etc. This information will help to reveal what the subject perceives
to be a good response, good both in tending to validate the hypothesis
of the experiment and in showing him off to his best advantage.
To the extent that the subject perceives the experiment as a
problem-solving situation where the subject's task is to ascertain the
experiment's true nature, the inquiry is directed toward clarifying the
subject's beliefs about its true nature. When, as is often the case, the
investigator will have told the subject in the beginning something about
why the experiment is being carried out, it may well be difficult for the
subject to express his disbelief, since to do so might put him in the
position of seeming to call the experimenter a liar. For reasons such as
these, the postexperimental interview must be conducted with considerable
tact and skill, creating a situation where the subject is able to communicate
freely what he truly believes without, however, making him unduly
suspicious or, worse yet, cueing him as to what he is to say. Using
another investigator to carry out the inquiry will often maximize
communication, particularly if the other investigator is seen as someone
who is attempting to learn more about what the subject experienced.
However, it is necessary to avoid having it appear as though the inquiry
is carried out by someone who is evaluating the experimenter, since
the student subject may identify with what he sees to be the
experimenter and try to make him look good rather than report
his real experience. The situational factors which will maximize the
subject's communicating what he is experiencing are clearly exceedingly
complex and conceptually similar to those which need to be taken into
account in clinical situations or in the study of taboo topics. Examples
of the factors are merely touched upon here.

It would be unreasonable to expect a one-to-one relationship between
the kind of data obtained by inquiry and the demand characteristics
which were actually perceived by the subject in the situation. Not only
do many factors militate against fully honest communication, but the
subject cannot necessarily verbalize adequately what he may have dimly
perceived during the experiment, and it is the dimly perceived factors
which may exert the greatest effect on the subject's experimental
behaviors. More important than any of these considerations is the
fact that an inquiry may be carried out at the end of a complex experiment
and that the subject's perception of the experiment's demand
characteristics may have changed considerably during the experiment. For
example, a subject might "catch on" to a verbal conditioning experiment
appear to be counterexpectational; that is, the predictions made on the
basis of intuitive "common sense" appear to be wrong whereas those
made on the basis of dissonance are both different and borne out by
data. Bem (1967) has shown in an elegant application of pre-inquiry
techniques that the findings are not truly counterexpectational in the
sense that subjects to whom the situation is described in detail but
who are not really placed in the situation are able to produce data
closely resembling those observed in typical cognitive dissonance studies.
On the basis of these findings, Bem (1967) appropriately questions the
assertion that the dissonance theory allows counterexpectational
predictions. His use of the pre-inquiry effectively makes the cognitive
dissonance studies it replicates far less compelling by showing that subjects
could figure out the way others might respond. It would be unfortunate
to assume that Bem's incisive critique of the empirical studies with the
pre-inquiry technique makes further such studies unnecessary. On the
contrary, his findings merely show that the avowed claims of these studies
were not, in fact, achieved and provide a more stringent test for future
experiments that aim to demonstrate counterexpectational findings.
It would appear that we are in the process of completing a cycle.
At one time it was assumed that subjects could predict their own
behavior, that in order to know what an individual would do in a given
situation it would suffice merely to ask him. It became clear, however,
that individuals could not always predict their behavior; in fact, serious
questions about the extent to which they could make any such predictions
were raised when studies showing differences between what individuals
thought they do and what they, in fact, do became fashionable.
With a sophisticated use of the pre-inquiry technique Bem (1967)
has shown that individuals have more knowledge about what they might
do than has been ascribed to them by psychologists. Although it is
possible to account for a good deal of variance in behavior in this way,
it is clear that it will not account for all of the variance. We are
confronted now with a peculiar paradox. When pre-inquiry data correctly
predict the performance of the subject in the actual experiment (the
situation that is most commonly encountered) the experimental findings
strike us as relatively trivial, in part because at best we have validated
our intuitive common sense but also because we cannot exclude the
nagging doubt that the subject may have merely been responsive to
the demand characteristics in the actual experiment. Only when we
succeed in setting up an experiment where the results are
counterexpectational in the sense that a pre-inquiry would yield different
findings from those obtained from the subjects in the actual situation can we
be read to him. The participant would then be asked to produce data
as if he actually had been subjected to the experimental treatment.
He could be given posttests or asked to fill out rating scales or requested
to carry out any behavior that might be relevant for the actual
experimental group.

The non-experiment yields data similar in quality to inquiry material
but obtained in the same form as actual subjects' data. Direct comparison
of non-experimental data and actual experimental data is therefore
possible. But caution is needed. If these two kinds of data are identical,
it shows only that the subject population in the actual experiment could
have guessed what was expected of them. It does not tell us whether
such guesses were the actual determinants of their behavior.
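
The logic of this comparison can be laid out as a small decision rule. The following Python sketch is only an illustration under stated assumptions: the function name `interpret_comparison`, the tolerance, and every number are hypothetical and do not come from any study discussed here.

```python
from statistics import mean


def interpret_comparison(non_experiment, experiment, tolerance=1.0):
    """Compare mean scores from a non-experiment and an actual experiment.

    `tolerance` is an arbitrary, hypothetical threshold for calling the
    two sets of data "the same"; a real study would use a proper test.
    """
    difference = abs(mean(non_experiment) - mean(experiment))
    if difference <= tolerance:
        # Agreement: subjects COULD have guessed what was expected.
        # It does not show that guessing determined their behavior.
        return "ambiguous: demand characteristics not excluded"
    # Disagreement: guessing alone is a less plausible account of the
    # data obtained from subjects in the actual experiment.
    return "different: guessing alone is a less plausible account"


# Made-up illustrative scores, not data from any study.
guessed = [4.0, 5.0, 6.0]
observed_similar = [4.5, 5.5, 5.0]
observed_different = [9.0, 10.0, 11.0]

print(interpret_comparison(guessed, observed_similar))
print(interpret_comparison(guessed, observed_different))
```

Note that neither branch returns a verdict about the independent variable itself; the comparison speaks only to whether an account in terms of guessing remains open.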
Kelman (1965) has recently suggested that such a technique might
appropriately be used as a social psychological tool to obviate the need
for deception studies. While the economy of this procedure is appealing,
and working in a situation where subjects become quasi-collaborators
instead of objects to be manipulated is more satisfying to many of us,
it would seem dangerous to draw inferences to the actual situation in
real life from results obtained in this fashion. In fact, when subjects
in pre-inquiry experiments perform exactly as subjects do in actual
experimental situations, it becomes impossible to know the extent to which
their performance is due to the independent variables or to the
experimental situation.

In most psychological studies, when one is investigating the effect
of the subject's best possible performance in response to
physical or psychological stimuli, there is relatively little concern for
the kind of problems introduced by demand characteristics. The need
to concern oneself with these issues becomes far more pronounced when
investigating the effect of various interventions such as drugs,
psychotherapy, hypnosis, sensory deprivation, conditioning of physiological
responses, etc., on performance or experiential parameters. Here the
possibility that the subject's response may inadvertently be determined by
altered demand characteristics rather than the process itself must be
considered. Equally subject to these problems are studies where attitude
changes rather than performance changes are explored. The investigator's
intuitive recognition that subjects' perceptions of an experiment and
its meaning are very likely to affect the nature of his responses may
have been one of the main reasons why deception studies have been
so popular in the investigation of attitude change.

Festinger's cognitive dissonance theory (1957) has been particularly
attractive to psychologists probably because it makes predictions which
is found between deeply hypnotized subjects and simulators. Such data
are not evidence that hypnosis consists only of a reaction to demand
characteristics. It may well have special properties. But so long as a
given form of behavior is displayed as readily by simulators as by "reals,"
our procedure has failed to demonstrate those properties. The problem
here is the same as that discussed earlier in the pre-inquiry. Most likely
there will be many real effects due to hypnosis which can be mimicked
successfully by simulators. However, only when we are able to demonstrate
differences in behavior between real and simulating subjects do
we feel that an experiment is persuasive in demonstrating that a given
effect is likely to be due to the presence of hypnosis.


IV. QUASI-CONTROLS: TECHNIQUES FOR THE 
EVALUATION OF EXPERIMENTAL ROLE DEMANDS 

The three techniques discussed above are not like the usual control
groups in psychological research. They ask the subject to participate
actively in uncovering explicit information about possible demand
characteristic effects. The quasi-control subject steps out of his traditional
role, because the experimenter redefines the interaction between them
to make him a co-investigator instead of a manipulated object. Because
the quasi-control is outside of the usual experimenter-subject
relationship, he can reveal the effects of this relationship in a new perspective.
An inquiry, for example, takes place only after the experiment has been
defined as "finished," and the subject joins the experimenter in reflecting
on his own earlier performance as a subject. In the non-experiment, the
quasi-control cooperates with the experimenter in second-guessing what
real subjects might do. Most dramatically, the simulating subject reverses
the usual relationship and deceives the experimenter.

It is difficult to find an appropriate term for these procedures. They
are not, of course, classical control groups since, rather than merely
omitting the independent variable, the groups are treated differently.
Thus we are dealing with treatment groups that facilitate inference about
the behavior of both experimental and control groups. Because these
treatment groups are used to assess the effect that the subject's perception
of being under study might have upon his behavior in the experimental
situation, they may be conceptualized as role demand controls in that
they clarify the demand characteristic variables in the experimental
situation for the particular subject population used. As quasi-controls,
the subjects are required to participate and utilize their cognitive
processes to evaluate the possible effect that thinking about the total situa-
be relatively comfortable that these findings represent the real effects
of the experimental treatment rather than being subject to alternative
explanations.

For the reasons discussed above, pre-inquiry can never supplant the
actual investigation of what subjects do in concrete situations although,
adroitly executed, it becomes an essential tool to clarify these findings.

C. Simulators 

This principle can be carried one step further to provide yet another
method for uncovering demand characteristics: the use of simulators
(Orne, 1959a). Subjects are asked to pretend that they have been
affected by an experimental treatment which they did not actually
receive or to which they are immune. For subjects to be able to do this,
it is crucial that they be run by another experimenter who they are
told is unaware of their actual status, and who in fact really is unaware
of their status. It is essential that the subjects be aware that the
experimenter is blind as well as that the experimenter actually be blind for
this technique to be effective. Further, the fact that the experimenter is
"blind" has the added advantage of forcing him to treat simulators and
actual subjects alike. This technique has been used extensively in the
study of hypnosis (e.g., Damaser, Shor, and Orne, 1963; Orne,
Orne and Evans, 1965; Orne, Sheehan, and Evans, 1968). For an
extended discussion, see Orne (1968). It is possible for unhypnotized
subjects to deceive an experimenter by acting as though they had been
hypnotized. Obviously, it is essential that the simulators be given no
special training relevant to the variables being studied, so that they
have no more information than what is available to actually hypnotized
subjects. The simulating subjects must try to guess what real subjects
might do in a given experimental situation in response to instructions
administered by a particular experimenter.

This design permits us to separate experimenter bias effects from
demand characteristic effects. In addition to his other functions, the
experimenter may be asked to judge whether each subject is a real subject
or a simulator. This judgment tends to be random and unrelated to
the true status of the subjects. Nevertheless, we have found differences
between the behaviors of subjects contingent on whether or not
the experimenter judges that they are hypnotized or just simulating.
Such differences may be ascribed to differential treatment,
whereas differences between actually hypnotized subjects and actual
simulators are likely to be due to hypnosis itself.
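
The separation described above amounts to reading a two-way layout along each of its margins. The sketch below, in Python, shows only the bookkeeping; the scores, the record layout, and the helper name `mean_difference` are invented for illustration and are not taken from the studies cited.

```python
from statistics import mean

# Each record: (true_status, experimenter_judgment, score).
# The scores are invented purely to illustrate the bookkeeping.
records = [
    ("hypnotized", "hypnotized", 8.0),
    ("hypnotized", "simulating", 6.0),
    ("simulator", "hypnotized", 5.0),
    ("simulator", "simulating", 3.0),
]


def mean_difference(records, field, first, second):
    """Mean score for `first` minus mean score for `second` on one factor.

    field 0 groups by true status, field 1 by the experimenter's judgment.
    """
    a = [r[2] for r in records if r[field] == first]
    b = [r[2] for r in records if r[field] == second]
    return mean(a) - mean(b)


# Difference tied to actual status: likely attributable to hypnosis itself.
status_effect = mean_difference(records, 0, "hypnotized", "simulator")
# Difference tied to the experimenter's guess: differential treatment.
judgment_effect = mean_difference(records, 1, "hypnotized", "simulating")

print(status_effect, judgment_effect)
```

Because the experimenter is blind, his judgment and the subject's true status vary independently, which is what allows the two margins to be read separately.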

Again, results obtained with this technique need careful evaluation.
It is important not to jump to a negative conclusion if no difference
of expert discussion. The use of quasi-controls, however, allows the
investigator to estimate these factors and how they might affect the
experimental results.

The kind of factors which we are discussing here relate to the manner
in which subjects are solicited (for example, the wording of an
announcement in an ad), the manner in which the secretary or research
assistant answers questions about the proposed experiment when subjects
call in to volunteer, the location of the experiment (i.e., psychiatric
hospital versus aviation training school), and, finally, a great many
details of the experimental procedure itself which of necessity are simplified
in the description, not to speak of the subtle cues made available by
the investigator himself. Quasi-controls are designed to evaluate the
total impact of these various cues upon the particular kind of population
which is to be used. It will be obvious, of course, that a verbal
conditioning experiment carried out with psychology students who have been
exposed to the original paper is by no means the same as the identical
experiment carried out with students who have not been exposed to this
information. Again, quasi-controls allow one to estimate what the demand
characteristics might be for the particular subject population being used.

Quasi-controls serve to clarify the demand characteristics but they
can never yield substantive data. They cannot even prove that a given
result is a function of demand characteristics. They provide information
about the adequacy of an investigative procedure and thereby permit
the design of a better one. No data are free of demand characteristics
but quasi-controls make it possible to estimate their effect on the data
which we do obtain.


V. THE USE OF QUASI-CONTROLS TO MAKE POSSIBLE 
A STUDY MANIPULATING DEMAND CHARACTERISTICS 

When extreme variations of experimental procedures are still able
to elicit surprisingly similar results or identical experimental procedures
carried out in different laboratories yield radically different results, the
likelihood of demand characteristic effects must be seriously considered.
An area of investigation characterized in this way was the early studies
on "sensory deprivation." The initial findings attracted wide attention
because they not only had great theoretical significance for psychology
but seemed to have practical implications for the space program as
well. A review of the literature indicated that dramatic hallucinatory
effects and other perceptual changes were typically observed after the
subject had been in the experiment approximately two-thirds of the
tion might have on their performance. They could, in this sense, be
considered active, as opposed to passive, controls.
A unique aspect of quasi-controls is that they do not permit inference
to be drawn about the effect of the independent variable. They can
never prove that a given finding in the experimental group is due to
the demand characteristics of the situation. Rather, they serve to suggest
alternative explanations not excluded by the experimental design
employed. The inference from quasi-control data, therefore, primarily
concerns the adequacy of the experimental procedure. In this sense, the
term design control or evaluative control would be justified.

Since each of these various terms focuses upon different but equally
important aspects of these comparison groups, it would seem best to
refer to them simply as quasi-controls. This explicitly recognizes that we
are not dealing with control groups in the true sense of the word and
are using the term analogously to the way in which Campbell and
Stanley (1963) have used the term quasi-experiments. However, while
they think of quasi-experiments as doing the best one can in situations
where "true experiments" cannot be carried out, the concept of
quasi-controls is intended to refer specifically to techniques for the assessment
of demand characteristic variables in order to evaluate how such factors
might affect the experimental outcome. The term "quasi" in this
context says that these techniques are similar to, but not really, control
groups. It does not mean that these groups are any less important in
helping to evaluate the data obtained from human subjects. In bridging
the gap from the laboratory experiments to situations where the
individual does not perceive himself to be a subject under investigation,
techniques of this kind are of vital importance.

It is frequently pointed out that investigators often discuss their
experimental procedures with colleagues in order to clarify their meaning.
Certainly many problems in experimental design will be apparent
to expert colleagues. These types of issues have typically been discussed
in the context of quantitative methods and have led to some elaborate
techniques of experimental design. There is no question that expert
colleagues are sensitive to order effects, baseline phenomena, practice
effects, sampling procedures, individual differences, and so on, but how
a given subject population would, in fact, perceive an experimental
procedure is by no means easily accessible to the usual tools of the
psychologists. Whether in a deception experiment the subject may become
partially or fully aware of what is really going on is a function of a
variety of cues in the situation not easily explicated, and the prior
knowledge of the subject population which might in some way be relevant
to the experiment is also not easily ascertained or abstracted by any



TABLE III

Summary and Analysis of Ten Tests for Control and Experimental Groups

Test and group                              Pretest M   Posttest M   Difference statistic

Mirror Tracing (errors)
  Experimental                                28.1         19.7      F = 1.67 (a)
  Control                                     35.8         15.2
Spatial Orientation
  Angular deviation
    Experimental                              45.7         53.9      F = .25 (a)
    Control                                   52.5         59.1
  Linear deviation
    Experimental                               5.3          5.4      F = 3.34 (b)
    Control                                    6.4          5.7
Word Recognition (N correct)
  Experimental                                17.3         15.6      t = .50
  Control                                     15.2         12.3
Reversible Figure (rate per minute)
  Experimental                                29.0         35.0      F = 1.54 (a)
  Control                                     20.1         25.0
Digit Symbol (N correct)
  Experimental                                98.2        109.9      F = .05 (a)
  Control                                     99.2        111.9
Mechanical Ability
  Tapping speed (N completed)
    Experimental                              33.9         32.2      F = 2.26
    Control                                   32.9         35.0
  Tracing speed (N completed)
    Experimental                              55.6         52.3      F = 4.57 (b)
    Control                                   53.1         58.4
  Visual pursuit (N completed)
    Experimental                               5.7          8.9      F = .22 (a)
    Control                                    5.7          9.2
Simple Forms (N increment distortions)
  Experimental                                  -           3.1      U = 19 (c)
  Control                                       -           0.8
Size Constancy (change in steps)
  Experimental                                  -           0.6      t = 1.03 (a)
  Control                                       -           0.0
Spiral Aftereffect
  Duration, seconds
    Experimental                              24.4         27.1      F = .99 (a)
    Control                                   15.6         16.1
  Absolute Change
    Experimental                                -           7.0      t = 3.38 (d)
    Control                                     -           2.7
Logical Deduction (N correct)
  Experimental                                  -          20.3      t = 1.64
  Control                                       -          22.1

F = adjusted postexperimental scores, analysis of covariance; t = t tests;
U = Mann-Whitney U test, where plot of data appeared grossly abnormal.
(From M. T. Orne and K. E. Scheibe, The contribution of nondeprivation
factors in the production of sensory deprivation effects: The psychology
of the "panic button." Journal of Abnormal and Social Psychology, 68,
1964, 3-12. Copyright (1964) by the American Psychological Association,
and reproduced by permission.)

(a) Indicates differences between groups were in predicted direction.

(b) p < .05, one-tailed.

(c) p = .01, one-tailed.

(d) p < .001, nondirectional measure.
total time; however, it seemed to matter relatively little whether the
total time was three weeks, two weeks, three days, two days, twenty-four
hours, or eight hours. Clearly, factors other than physical conditions
would have to account for such discrepancies. As a first quasi-control
we interviewed subjects who had participated in such studies. It became
clear that they had been aware of the kind of behavior that was expected
of them. Next, a pre-inquiry was carried out, and, from participants
who were guessing how they might respond if they were in a sensory
deprivation situation, we obtained data remarkably like that observed
in actual studies.† We were then in a position to design an actual
experiment in which the demand characteristics of sensory deprivation
were the independent variables (Orne and Scheibe, 1964). Our results
showed that these characteristics, by themselves, could produce many
of the findings attributed to the condition of sensory deprivation. In
brief, one group of the subjects were run in a "meaning deprivation"
study which included the accoutrements of sensory deprivation research
but omitted the condition itself. They were required to undergo a physical
examination, provide a short medical history, sign a release form,
were "assured" of the safety of the procedure by the presence of an
emergency tray containing various syringes and emergency drugs, and
were taken to a well-lighted cubicle, provided food and water, and given
an optional task. After taking a number of pretests, the subjects were
told that if they heard, saw, smelled, or experienced anything unusual
they were to report it through the microphone in the room. They were
again reassured and told that if they could not stand the situation any
longer or became discomforted they merely had to press the red "panic
button" in order to obtain immediate release.

They were then subjected to four hours of isolation in the experimental
cubicle and given posttests. The control subjects were told that they
were controls for a sensory deprivation study and put in the same
objective conditions as the experimental subjects. Table III summarizes the
findings, which indicate that manipulation of the demand characteristics
by themselves could produce many findings that had been
ascribed to the sensory deprivation condition. Of course, neither the
quasi-controls nor the experimental manipulation of the demand
characteristics sheds light on the actual effects of the condition of sensory
deprivation. They do show that demand characteristics may produce
similar effects to those ascribed to sensory deprivation.

* Unpublished study.

† Stare, F., Brown, J., and Orne, M. T. Demand characteristics in sensory
deprivation studies. Unpublished seminar paper, Massachusetts Mental
Health Center and Harvard University, 1959.
VI. THE PROBLEM OF INFERENCE


Great care must be taken in drawing conclusions from experiments
of this kind. In the case of the sensory deprivation study, the demand
characteristics of the laboratory and those which might be encountered
by individuals outside of the laboratory differ radically. In other
situations, however, such as in the case of hypnosis, the expectations of
subjects about the kind of behavior hypnosis ought to elicit in the
laboratory are similar to the kind of expectations which patients might have
about being hypnotized for therapeutic purposes. To the extent that
the hypnotized individual's behavior is determined by these expectations
we might find similar findings in certain laboratory contexts and certain
therapeutic situations. When demand characteristics become a significant
determinant of behavior, valid accurate predictions can only be made
about another situation where the same kind of demand characteristics
prevails. In the case of sensory deprivation studies, accurate predictions
would therefore not be possible but, even in the studies with hypnosis,
we might still be observing an epiphenomenon which is present only
as long as consistent and stable expectations and beliefs are present.
In order to get beyond such an epiphenomenon and find intrinsic
characteristics, it is essential that we evaluate the effect that demand
characteristics may have. To do this we must seek techniques specifically
designed to estimate the likely extent of such effects.


VII. PSYCHOPHARMACOLOGICAL RESEARCH AS A MODEL
FOR THE PSYCHOLOGICAL EXPERIMENT 

What are here termed the demand characteristics of the experimental
situation are closely related to what the psychopharmacologist considers
a placebo effect, broadly defined. The difficulty in determining what
aspects of a subject's performance may legitimately be ascribed to the
independent variable as opposed to those which might be due to the
demand characteristics of the situation is similar to the problem of
determining what aspects of a drug's action are due to pharmacological effect
and what aspects are due to the subject's awareness that he is being
given a drug. Perhaps because the conceptual distinction between a
drug effect and the effect of psychological factors is readily made,
perhaps because of the relative ease with which placebo controls may be
included, or most likely because of the very practical implications
of psychopharmacological research, considerable effort has gone into
differentiating pharmacological action from placebo effects. A brief
review of relevant observations from this field may help clarify the problem
of demand characteristics.

In evaluating the effect of a drug it has long been recognized that
a patient's expectations and beliefs may have profound effects on his
experiences subsequent to the taking of the drug. It is for this reason
that the use of placebos has been widespread. The extent of the placebo
effect is remarkable. Beecher (1959), for example, has shown that in
battlefield situations saline solution by injection has 90 per cent of the
effectiveness of morphine in alleviating the pain associated with acute
injury. In civilian hospitals, postoperatively, the placebo effect drops
to 70 per cent of the effectiveness of morphine, and with subsequent
administrations drops still lower. These studies show not only that the
placebo effect may be extremely powerful, but that it will interact with
the experimental situation in which it is being investigated.

It soon became clear that it was not sufficient to use placebos so
long as the investigator knew to which group a given individual
belonged. Typically, when a new, presumably powerful, perhaps even
dangerous medication is administered, the physician takes additional care
in watching over the patient. He tends to be not only particularly hopeful
but also particularly concerned. Special precautions are instituted,
nursing care and supervision are increased, and other changes in the regime
inevitably accompany the drug's administration. When a patient is on
placebo, even if an attempt is made to keep the conditions the same,
there is a tendency to be perfunctory with special precautions, to be
more cavalier with the patient's complaints, and in general to be less
concerned and interested in the placebo group. For these reasons, the
doctor, as well as the patient, is required to be blind as to the true
nature of a drug; otherwise differential treatment could well account
for some of the observed differences between drug and placebo (Modell
and Houde, 1958). The problems discussed here would be conceptualized
in social psychological terms as E-bias effects or differential E-outcome
expectations.

What would appear at first sight to be a simple problem (to determine
the pharmacological action of a drug as opposed to those effects which
may be attributed to the patient's awareness that he is being treated
by presumably effective medication) turns out to be extremely
difficult. Indeed, as Ross, Krugman, Lyerly, and Clyde (1962) have
pointed out, and as discussed by Lana (Chapter 4), the usual clinical
techniques can never evaluate the true pharmacological action of a
drug. In practice, patients are given a drug and realize that they are
being treated; therefore one always observes the pharmacological
action of the drug confounded with the placebo effect. The typical
study with placebo controls compares the effect of placebo and drug
versus the effect of placebo alone. Such a procedure does not get at
the psychopharmacological action of the drug without the placebo
effect, i.e., the patient's awareness that he is receiving a drug.
Ross et al. elegantly demonstrate this point by studying the effect of
chloral hydrate and amphetamine in a 3 X 3 design. Amphetamine,
chloral hydrate, and placebo were used as three agents with three
different instructions: (a) administering each capsule with a brief
description of the amphetamine effect, (b) administering each capsule
with a brief description of the chloral hydrate effect, and
(c) administration without the individual's awareness that a drug was
being administered. Their data clearly demonstrate that drug effects
interact with the individual's knowledge that a drug is being
administered.
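
The crossing of three agents with three instruction conditions can be
laid out explicitly. What follows is a minimal sketch in Python, not a
reconstruction of Ross et al.'s procedure: the labels and the function
name build_design are illustrative assumptions, and the code only
enumerates the nine cells of the design. It contains no data.

```python
from itertools import product

# Labels paraphrase the three agents and the three instruction
# conditions described in the text; the wording is illustrative.
AGENTS = ("amphetamine", "chloral hydrate", "placebo")
INSTRUCTIONS = (
    "described as amphetamine",
    "described as chloral hydrate",
    "no awareness of drug",
)


def build_design(agents=AGENTS, instructions=INSTRUCTIONS):
    """Return every (agent, instruction) cell of the crossed design."""
    return list(product(agents, instructions))


if __name__ == "__main__":
    for agent, instruction in build_design():
        print(f"{agent:16s} x {instruction}")
```

The usual placebo-controlled comparison uses only cells in which the
patient knows he is receiving a drug; the cells without such awareness
are what the crossed design adds.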

For clinical psychopharmacology, the issues raised by Ross et al. are
somewhat academic since in medical practice one is almost always
dealing with combinations of placebo components and drug effects.
Studies evaluating the effect of drugs are intended to draw inference
about how drugs work in the context of medical practice. To the extent
that one would be interested in the psychopharmacological effect as
such (that is, totally removed from the medical context) the type of
design Ross et al. utilized would be essential.

In psychology, experiments are carried out in order to determine the
effect of an independent variable so that it will be possible to draw
inference to non-experimental situations. Unfortunately the
independent variables tend to be studied in situations that are
explicitly defined as experimental. As a result, one observes the
effect of an experimental context in interaction with a particular
independent variable versus the effect of the experimental context
without this variable.

The problem of the experimental context in which an investigation
is carried out is perhaps best illustrated in psychopharmacological
research on the effects of meprobamate (known under the trade names
of Equanil and Miltown). Meprobamate had been established as effective
in a number of clinical studies but, when carefully controlled
investigations were carried out, it did not appear to be more
efficacious than placebo. The findings from carefully controlled
studies appeared to contradict a large body of clinical observations
which one might have a tendency to discount as simply due to placebo
effect. It remained for Fisher, Cole, Rickels, and Uhlenhuth (1964) to
design an investigation to clarify this paradox, using physicians
displaying either a “scientific,” skeptical attitude toward medication
or enthusiasm about the possible help which the drug would yield. The
study was blind. The patients treated by physicians with a
“scientific” attitude toward medication showed no difference between
drug and placebo; however, those treated by enthusiastic physicians
clearly demonstrated an increased effectiveness of meprobamate. It
would appear that there is a “real” drug effect of meprobamate which
may, however, be totally obscured by the manner in which the drug is
administered. The effect of the drug emerges only when medication is
administered with conviction and enthusiasm. The striking interaction
between the drug effects and situation-specific factors not only
points to limitations in conclusions drawn from double-blind studies
in psychopharmacology but also has broad methodological implications
for the experimental study of psychological processes. An example of
these implications from an entirely different area is the
psychotherapy study by Paul (1966) which showed differences in
improvement between individuals expecting to be helped at some time in
the future and a matched control group who were not aware that they
were included in the research.
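
The pattern described for the meprobamate study is an interaction: the
difference between drug and placebo depends on the physician's
attitude. The following Python sketch shows how such an interaction
contrast is computed from four cell means. The numbers are invented
for illustration and are not data from Fisher et al.; the helper names
simple_effect and interaction_contrast are likewise hypothetical.

```python
def simple_effect(cell_means, attitude):
    """Drug minus placebo mean improvement for one physician attitude."""
    return cell_means[(attitude, "drug")] - cell_means[(attitude, "placebo")]


def interaction_contrast(cell_means):
    """Difference between the two simple drug effects in a 2 x 2 design."""
    return (simple_effect(cell_means, "enthusiastic")
            - simple_effect(cell_means, "skeptical"))


# Invented mean improvement scores, for illustration only.
example_means = {
    ("skeptical", "drug"): 2.0,
    ("skeptical", "placebo"): 2.0,
    ("enthusiastic", "drug"): 5.0,
    ("enthusiastic", "placebo"): 3.0,
}

if __name__ == "__main__":
    print("skeptical:", simple_effect(example_means, "skeptical"))
    print("enthusiastic:", simple_effect(example_means, "enthusiastic"))
    print("interaction:", interaction_contrast(example_means))
```

A contrast of zero would mean the drug effect is the same under both
attitudes; a nonzero contrast is what the text calls an interaction
between drug effects and situation-specific factors.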


VIII. DEALING WITH THE PLACEBO EFFECT: AN ANALOGY
TO DEALING WITH DEMAND CHARACTERISTICS

Drug effects that are independent of the patient's expectations,
beliefs, and attitudes can of course be studied with impunity without
concern about the psychological effects that may be attributed to the
taking of medication. For example, the antipyretic fever-reducing
effect of aspirin is less likely to be influenced by the patient's
beliefs and expectations than is the analgesic effect, though even
here an empirical approach is considerably safer than a priori
assumptions.

Of greatest relevance are the psychological effects of drugs. The
problems encountered in studying these effects, while analogous to
those inherent in other kinds of psychological research, seem more
evident here. Since the drug constitutes a tangible independent
variable (subject to study by pharmacological techniques), it is
conceptually easily distinguished from another set of independent
variables, psychological in nature, that also may play a crucial part
in determining the patient's response.

The totality of these non-drug effects which are a function of the
patient's expectations and beliefs in interaction with the medical
procedures that are carried out, the doctor's expectations, and the
manner in which he is treated have been conceptualized as placebo
effect. This is, of course, analogous to the demand characteristic
components in psychological studies; the major difference is that the
concept of placebo component directly derives from methodological
control procedures used to evaluate it.

The placebo is intended to produce the same attitudes, expectations,
and beliefs of the patient as would the actual drug. The double-blind
technique is designed to equate the environmental cues which would
interact with these attitudes. For this model to work, it is essential
that the placebo provide subjective side effects analogous to the
actual drug lest the investigator and physician be blind but the
patient fully cognizant that he is receiving a placebo. For these
reasons an active placebo should be employed which mimics the side
effects of the drug without exerting a central pharmacological action.

With the use of active placebos administered by physicians having
appropriate clinical attitudes in a double-blind study, a technically
difficult but conceptually straightforward technique is available for
the evaluation of the placebo effect. This approach satisfies the
assumptions of the classical experimental model. One group of patients
responds to the placebo effect and the drug, the other group to the
placebo effect alone, which permits the investigator to determine the
additive effect that may be attributed to the pharmacological action
of the drug. Unfortunately such an ideal type of control is not
generally available in the study of other kinds of independent
variables. This is particularly true regarding the context of such
studies. Thus, the placebo technique can be applied in clinical
settings where the patient is not aware that he is the object of such
study, whereas psychological studies most frequently are recognized as
such by our subjects, who typically are asked to volunteer. Because a
true analog to the placebo is not readily available, quasi-control
techniques are being proposed to bridge the inferential gap between
experimental findings and the influence of the experimental situation
upon the subject who is aware that he is participating in an
experiment.
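
Under the additive assumption of the classical model, the
pharmacological contribution is estimated as the difference between
the two group means. The Python sketch below illustrates only that
arithmetic; the scores are invented and the function name
estimate_drug_effect is a hypothetical choice.

```python
from statistics import mean


def estimate_drug_effect(drug_group, placebo_group):
    """Estimate the additive drug effect from two blinded groups.

    drug_group:    scores reflecting placebo effect plus drug action
    placebo_group: scores reflecting placebo effect alone
    """
    if not drug_group or not placebo_group:
        raise ValueError("both groups need at least one score")
    return mean(drug_group) - mean(placebo_group)


# Invented scores, for illustration only.
drug_scores = [7, 9, 8, 8]
placebo_scores = [5, 6, 4, 5]

if __name__ == "__main__":
    print(estimate_drug_effect(drug_scores, placebo_scores))
```

The subtraction is meaningful only if the placebo effect is the same
in both groups, which is what the active placebo and the double-blind
administration are meant to secure.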
course, we might well observe relatively little effect. Then as we get
used to the drug a bit we might see it causes relaxation, a lessening
of control, perhaps even some slurring of speech; in fact, some of the
kind of changes typically associated with alcohol.

At this point, working with relatively small dosages of the drug, we
would find that there were wide individual differences in response,
some individuals actually becoming hyperalert, and one might wonder to
what extent the effects could be related to subjects' beliefs and
expectations. Under these circumstances the inquiry procedures
discussed earlier could be carried out after the drug had been given.
One would focus the inquiry on what the subject feels the drug might
do, the kind of side effects he might expect, what he anticipated he
would experience subsequent to taking the drug, what he thought we
would have expected to happen, what he believed others might have
experienced after taking the drug, etc. Data of this kind might help
shed light on the patient's behavior.

Putting aside the difficulty of interpreting inquiry material, and
assuming we are capable of obtaining a good approximation of what the
subject really perceived, we are still not in a position to determine
the extent to which his expectations actually contributed to the
effects that had been observed. Consider: if a really large dose of
amytal had been given, essentially all subjects would have gone to
sleep and would most likely have correctly concluded they had been
given a sleeping pill, the inquiry data in this instance being the
result of the observed effect rather than the cause of it. Inquiry
data would become suggestive only if (in dealing with relatively small
dosages) it were found that subjects who expected or perceived that we
expected certain kinds of effects did in fact show these effects
whereas subjects who had no such expectations failed to show the
effects. Even if we obtained such data, however, it would still be
unclear whether the subject's perceptions were post hoc or propter
hoc. The most significant use of inquiry material would be in
facilitating the recognition of those cues in the situation which
might communicate what is expected to the subject so that these cues
could be altered systematically. Neither subject nor investigator is
really in a position to evaluate how much of the total effect may
legitimately be ascribed to the placebo response and how much to drug
effect. Evaluation becomes possible only after subsequent changes in
procedure can be shown to eliminate certain effects even though the
same drug is being administered, or, conversely, subjects' perceptions
upon inquiry are changed without changing the observed effect. The
approach then would be to compare the effect of the drug in
interaction with different sets of demand characteristics in order to
estimate how much of the



MARTIN T ORNE 


The function of quasi-controls to determine the possible contextual
effects of an experimental situation is perhaps clarified if we
contrast them with the use of placebos in evaluating possible placebo
effect. Assume that we wish to evaluate an unknown drug purported
to be a powerful sedative and that neither [...] nor placebo controls
are available as methodological tools. All we are able to do is to
administer the drug under a variety of conditions and observe its
effects. This is in many ways analogous to the independent variable
that we normally study in psychology. In fact, in our example the
unknown drug will be sodium amytal, a powerful hypnotic with
indisputable pharmacological action.

On giving the drug the first time, with considerable trepidation of

peripheral blood flow and were given a description of an experimental
procedure congruent with such a drug study, they would not be likely
to show a decrement in performance data. However, subjects who were
run with the drug and such instructions would presumably yield the
standard subnormal performance. In other words, the quasi-control of
the non-experiment has allowed us to economically assess the possible
effects of instructional sets rather than allowing drug inference. It
is an efficient way of clarifying the adequacy of experimental
procedures as a prelude to the definitive study.*

A somewhat more elaborate procedure would be to instruct subjects
to simulate.† It would be relatively easy to use simulators in a
fashion analogous to that suggested in hypnosis research. Two
investigators would be employed, one who would administer the
medication and one who would carry out all other aspects of the study.
The simulators would, instead of receiving the drug, be shown the
medication, would read exactly the information given to the drug
subjects, but would be told they would not be given the drug. Instead,
their task would be to deceive the other experimenter and to make him
think they had actually received the drug. They would further be told
the other experimenter was blind and would not know they were
simulating; if he really caught on to their identity, he would
disqualify them; therefore, they should not be afraid they would give
themselves away since, as long as they were not disqualified, they
were doing well. The subject would then be turned over to the other
experimenter who would, in fact, be blind as to the true status of the
subject. The simulating subject, under these circumstances, would get
no more and no less information than the subject receiving the actual
drug (except cues of subjective side effects from the drug). He would
be treated by the experimenter in essentially the same fashion. This
procedure avoids some of the possible difficulties of differential
treatment inherent in the non-experiment. Even under these
circumstances, however, if both groups produce, let's say, identical
striking alterations of subjective experience, it would still be

* Obviously, extreme caution is needed in interpreting differences
between the performance of the individuals actually receiving the drug
and that of the non-experimental control. The subjects in a
non-experiment cannot really be given the identical cues and role
support provided the subject who is actually taking the drug. While
the identical instructions may be read to him, it is essentially
impossible to treat such subjects in the same fashion. Obviously the
investigator is not concerned about side effects, possible dangers,
etc. A great many cues which contribute to the demand characteristics,
including drug side effects, are thus different for the subject
receiving the actual drug, and differences in performance could be due
to many aspects of this differential treatment.

† The use of simulators as an alternative to placebo in
psychopharmacological studies was suggested by Frederick J. Evans.

total effect can reasonably be ascribed to demand characteristic
components. (The paper previously mentioned by Ross et al. [1962]
reports precisely such a study with amytal and showed clear-cut
differences.)

It is clear that the quasi-control of inquiry can only serve to
estimate the adequacy of the various design modifications. Inference
about these changes must be based on effects which the modifications
are shown to produce in actual studies of subjects' behavior.
The non-experiment can be used in precisely the same way. It has the
advantage and disadvantage of eliminating cues from the drug
experience. Here one would explain to a group of subjects drawn from
the usual subject pool precisely what is to be done, show them the
drug that is to be taken, give them the identical information provided
to those individuals who actually take the drug, and, finally, ask
them to perform on the tests to be used as if they had received the
drug. This procedure has the advantage that the experimenter need not
infer what the subject could have deduced about what was expected and
how these perceptions could then have affected his performance.
Instead of requiring the experimenter to interpret inquiry data and to
make assumptions about how presumed attitudes and beliefs could
manifest themselves on the particular behavioral indices used, the
subject provides the experimenter with data in a form identical to
that provided by those individuals who actually take the drug.

The fact that the non-experimental subject yields data in [...]
must not, however, seduce the investigator into believing that the
data are in [...] ways equivalent. Inference from such a procedure
about the demand characteristic components of the drug effect would
need to be guarded indeed.

Such findings merely indicate that sufficient cues are present in the
situation to allow a subject to know what is expected and these may,
but need not, be responsible for the data. To illustrate, if in doing
the non-experiment one tells the subject he will receive three
sleeping capsules and then asks him to do [...] prolonged
concentration, the subject is very likely to realize he ought to
perform as though he were quite drowsy and yield a significantly
subnormal performance. The fact that these subjects do [...] like
actual subjects receiving three sleeping capsules of sodium amytal
would not negate the possible real drug effects which, in our example,
are known to be powerful. The only thing it indicates is that the
experimental procedure allows for an alternative explanation and needs
to be refined. Again, the non-experiment would [...] if subjects
instead of being told that they [...] capsules were told we are
investigating a drug designed to

words, in any given context there are a large number of demand
characteristics inherent in the situation and the subject responds
only to those aspects of the demand characteristics which he perceives
(there will be many cues which are not recognized by a particular
subject) and, of those aspects of the demand characteristics which are
perceived at some level by the subject, only some will have a
behavioral consequence. One might consider any given experiment as
having demand characteristics which fall into two groups: (a) those
which will be perceived and responded to and are, therefore, active in
creating experimental effects (that is, they will operate
differentially between groups) and (b) those which are present in the
situation but either are not readily perceived by most subjects or,
for one reason or another, do not lead to a behavioral response by
most of the subjects. Quasi-control procedures tend to maximally
elicit the subject's responses to demand characteristics. As a result,
the behavior seen with quasi-control subjects may include responses to
aspects of the demand characteristics which for the real subjects are
essentially inert. All that possibly can be determined with
quasi-controls is what could be salient demand characteristics in the
situation; whether the subjects actually respond to those same demand
characteristics cannot be confirmed. Placebo controls or other passive
control groups such as those for whom demand characteristics are
varied as independent variables are necessary to permit firmer
inference.
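
The distinction between cues that quasi-controls act on and cues that
drive real subjects' behavior can be stated with sets. The Python
sketch below uses invented cue labels and a hypothetical helper,
possibly_inert_for_real_subjects. In practice the set of cues acted on
by real subjects is unknown; it is assumed here only to show the logic
of why quasi-control data mark what could matter rather than what does.

```python
def possibly_inert_for_real_subjects(quasi_control_cues, real_subject_cues):
    """Cues that quasi-controls act on but real subjects do not."""
    return set(quasi_control_cues) - set(real_subject_cues)


# Invented cue labels, for illustration only.
present = {"instructions", "one-way mirror", "polygraph", "consent form"}
acted_on_by_quasi_controls = {"instructions", "one-way mirror", "polygraph"}
# Unknown in a real study; assumed here to demonstrate the comparison.
acted_on_by_real_subjects = {"instructions"}

if __name__ == "__main__":
    inert = possibly_inert_for_real_subjects(
        acted_on_by_quasi_controls, acted_on_by_real_subjects
    )
    print(sorted(inert))
```

Because the second set cannot be observed directly, firmer inference
requires varying the cues themselves as independent variables.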


IX. A FINAL EXAMPLE 

The problem of inference from quasi-controls is illustrated in a study
(Evans, 1966; Orne and Evans, 1966) carried out to investigate what
happens if the hypnotist disappears after deep hypnosis has been
induced. This question is by no means easy to examine. The hypnotist's
disappearance must be managed in such a way as to seem plausible
and truly accidental in order to avoid doing violence to the implicit
agreement between subject and hypnotist that the latter is responsible
for the welfare of the former during the course of the experiment.
Such a situation was finally created in a study requiring two sessions
with subjects previously trained to enter hypnosis readily. It was
explained to them that in order to standardize the procedure all
instructions, including the induction and termination of hypnosis,
would be carried out by tape recording.

The experimenter's task was essentially that of a technician: turning
on the tape recorder, applying electrodes, presenting experimental
materials, etc. He did not say anything throughout the study since
every

erroneous to conclude that there is no drug effect. Rather, one would
have to conclude that the experimental procedure is inadequate, that
the experience of the subjects receiving the drug could (but need not)
be due to placebo effects. Whether this is in fact the case cannot be
established with this design. The only conclusion which can be drawn
is that the experimental procedure is not adequate and needs to be
modified. Presumably an appropriate modification of the demand
characteristics would, if there is a real drug effect, eventually
allow a difference to emerge between subjects who are receiving drugs
and subjects who are simulating.

The interpretation of findings where the group of subjects receiving
drugs performs differently from those who are simulating also requires
caution. While such findings suggest that drug effect could not be due
simply to the demand characteristics because it differs from the
expectations of the simulators who are not exposed to the real
treatment, the fact that the simulating group is a different treatment
group must be kept in mind. Thus, some behavior may be due to the
request to simulate. Greater evasiveness on the part of simulating
subjects could most likely be ascribed to the act of simulation.
Greater suspiciousness on the part of a simulator could equally be a
function of the peculiar situation into which the subject is placed.
These observations underline the fact that the simulator, who is a
quasi-control, is effective primarily in clarifying the adequacy of
the research procedure. The characteristic of this treatment group is
that it requires subjects actively to participate in the experiment in
contrast to a control group which receives the identical treatment
omitting the drug, as would be the case when placebos are used.

The problem of inference from data obtained through the use of
quasi-controls is seen relatively easily when one attempts to evaluate
the contributions which demand characteristics might make to the
total behavior after receiving a drug. Clearly the placebo design
[...] used is the most adequate approach. This will tell us how much
of the behavior of those individuals receiving the drug can be
accounted for on the basis of their receiving a substance which has no
specific effect but which mimics the side effects when the
experimental situation and treatment of the subject are identical. The
placebo effect is the behavioral consequence which results from the
demand characteristics which are (1) perceived and (2) responded to.
In other

* For a discussion of the placebo response, see Beecher (1959) and
[...] (1964). Undoubtedly it has a large number of components and is
influenced by both situational and personality factors; particularly
its relation to [...] (Evans, 1967) is of considerable interest and is
by no means [...]. These [...], however, go beyond the scope of this
paper.

tween these two groups, these results do not permit any conclusions
to be drawn about the nature of the hypnotic phenomenon. The behavior
of the quasi-controls demonstrates an inadequacy of the deception
procedure and indicates that the behavior of the hypnotized subjects
could not be interpreted at face value.

Having found the deception manipulation wanting we re-examined 
in detail the cues which might allow the subject to become aware that 
the power failure was a planned part of the experiment and that his 
behavior would actually be observed subsequent to it. The kind of nitty- 
gritty details which most investigators fail to publish and which, even 
if they wished to, most editors will delete for the sake of brevity, are 
crucial in this regard: to wit, in the study just described, to facilitate 
observation after the “power failure,” the overhead lights remained on 
and only the bright desk lamp went out as the tape recorder stopped; 
an obvious one-way mirror had always been in the room to which we 
had hoped (in vain) the subject would have adapted; further, a small 
polygraph continued to run in the experimental room after the “power 
failure,” etc. We now redesigned the study with considerable attention 
to such details. The one-way screen was eliminated and replaced by 
another less well-known and carefully hidden means of observation. 
When the power failure occurred all lights in the room were extin- 
guished. Just enough light was filtered into the room through the window 
curtains from daylight— or from a strategically placed parking light out- 
side at night — to make observation possible. Furthermore, physiological 
recordings were made by means of a large Ampex instrumentation tape 
recorder and a small polygraph. Both of these machines stopped, along 
with the lights and the tape recorder, with the power failure. (Actually,
the wires from the subject were run through to another room at some 
distance in order to permit continuous recording of relevant data.) When 
the experimenter left the room the hall lights were out. Every effort
was made by scrupulous attention to such details to make the power 
failure plausible. 

With these revised conditions simulating and hypnotized subjects were
run in another experiment by a blind experimenter. Here five out of
six simulating subjects stopped simulating the moment the technician
left the room, indulging in exploratory and other private behavior
clearly indicating that they felt themselves to be unobserved. On the
other hand, the hypnotized group behaved much as they had in the
earlier experiment, pulling themselves out of this state only after a
considerable period of time had elapsed. This behavior was in line
with the data provided during an inquiry, indicating that both real
and simulating subjects accepted the power failure as genuine.

item of instruction was given by means of the tape recorder. Each
subject came for two such sessions. During the second session, while
the subject was deeply hypnotized and tapping his foot in rhythm with
hallucinated music, the tape recorder came to a grinding halt, the
light went out on the desk, and the experimenter tried in vain to
reactivate the machine by flicking the switch. He then muttered under
his breath that the fuse must be blown and dashed from the room.

The subject's behavior was observed through a one-way screen
throughout the experiment in order to determine what would happen
once the experimenter had left the room. Would hypnosis be terminated
immediately once the subject was alone, as some theories would
predict, or would it take a period of time for the subject to pull
himself out of hypnosis, as one would expect if the condition were
some special altered state?

Despite the “cuteness” of the deception procedure of the simulated
power failure there was no certainty whether the subject would accept
it as such. For this reason quasi-controls were run. In this paradigm
simulating subjects were the most appropriate kind of quasi-control
and these were included without the hypnotist being aware which
subjects were in fact deeply hypnotizable and which were
unhypnotizable individuals instructed to simulate. The findings were
clear-cut: five out of the six deeply hypnotized subjects did not
arouse immediately when the hypnotist left. They required varying
periods of time to apparently pull themselves out of the state. When
they did, they appeared somewhat disoriented and confused. In
contrast, five out of the six simulating subjects continued to
simulate hypnosis for the whole half-hour period until the technician
returned, presumably having fixed the fuse. During the subsequent
postexperimental inquiry it became clear that the simulators caught on
to the fact that the power failure was contrived and that they behaved
as they thought a hypnotized subject would if there were a true power
failure. They quite cogently pointed out that having been instructed
to fool the experimenter put them on their guard lest they themselves
be fooled. In contrast, only one of the hypnotized subjects suspected
that the power failure was contrived.

Because the quasi-control subjects are a special group, their purpose
is not primarily as a comparison with actual subject populations;
rather, it is to clarify the adequacy of the experimental procedure.
Since the simulators caught on to the fact that the power failure
was contrived, it was plausible that at some level the hypnotized
subjects could have responded to the same cues though they [...]
such awareness. Therefore despite statistically significant
differences be-

other sciences have had to do likewise. Thus, aerodynamics has had
to develop conversion factors before data obtained in the wind tunnel
could be safely applied to a plane in flight. Similarly, inference
from the action of an antibiotic in the test tube to its medical
effects on the organism depends on recognition that effects in vitro
may differ from those in vivo. We cannot afford to give up either
laboratory research or observation in a naturalistic setting. Both
kinds of data are an integral part of behavioral science.
In addition to the usual control procedures which are recognized as
necessary in isolating the action of an independent variable in any
experiment, studies with human subjects require a set of controls
designed to look at the effect of the experimental technique itself.
These controls do not permit a direct inference about the independent
variable. Rather, they are designed to allow the investigator to
estimate the effects which are due to the situation under which a
study is being carried out. The term quasi-control has been suggested
to differentiate these techniques from the more typical control
measures. The kinds of quasi-controls outlined here all share the
feature that they utilize the ability of subjects to reflect upon the
context in which they are being investigated, as a means of
understanding the way in which this context might affect their own and
other subjects' behavior. Undoubtedly other quasi-controls will need
to be developed in order to facilitate inference about human behavior
from one context to another.
While the difficulty of inference from one context to another is
recognized by all scientists, psychology and the other behavioral
sciences are in a peculiar position. The object of our study is man.
The implications of our research relate to man's behavior. It is not
surprising that our findings are of considerable interest to
individuals outside of scientific disciplines. Studies in the
behavioral sciences tend increasingly to affect policy decisions. Even
the scientist in pure research may find his data quoted as the basis
of a decision where he himself would feel there is little relevance.
Whether we welcome this tendency or view it with alarm, it seems
likely to continue.

With the increasing interest in and dissemination of knowledge about
behavioral research, it becomes important to see what is needed before
meaningful generalization is possible. This problem is particularly acute
in experimental work, although the Hawthorne studies (Roethlisberger
and Dickson, 1939) demonstrate that it also exists in research outside
of the laboratory. Perhaps our responsibility extends beyond our subjects
and our disciplines, to include a concern with the kinds of generalizations
which may be drawn from our work. The leap is one which others
are so eager to make that we can hardly avoid considering it ourselves.



Significant differences were again obtained but, in contrast to the
findings of the first study, those from the second study allow meaningful
inference. The behavior of the quasi-controls clearly indicates that
the power failure was accepted as genuine and therefore it is plausible
to accept the behavior of the hypnotized individuals at face value.
It will be clear that the purpose served by the quasi-controls was simply
to determine whether or not there were sufficient cues in the
experimental situation to allow the subjects to surmise that the power failure
was staged rather than spontaneous. However, this issue is crucial if
we hope to draw inference to a situation which is perceived by the
subject as extra-experimental. Though it entailed a great deal of effort
to test the adequacy of the deception manipulation, without quasi-controls
we would have had no empirical test of the procedure's adequacy. It is
not possible, without techniques such as this, to evaluate the cues
presented to the subject in an experimental situation, and yet the
extent to which such cues are present will determine the kind of inference
which legitimately can be drawn from experimental findings.


X. CONCLUSION 

Research with human subjects introduces a new set ^ct 

because the subjects are sentient beings who are affecte y 
of observation and, particularly in expenmental contexts, a ^ 
means neutral to the outcome of the study The kinds of vana 
affect subjects’ perceptions about the expenment, its 50 

one hopes to find, how they may perform as good su 
forth — especially those not specifically communicated but ra ji 

m what the subject learns about the experiment and t e P™ 
self — have been termed the demand charactenstics of t e ex^ ^ 
situation The nature of the effects of demand characterful 
that certain findings may be observed — and may even situation 
m laboratory situations — but be specific to the expenme 
In order to make inference beyond the expenmenta introduced 

nomena occurring outside the laboratory the possible e haic 

by demand characteristics must be considered These 1 |jgj.jitory 
led some to suggest that psychologists must leave * ^ it is dc- 

conduct research exclusively in naturalistic settings 0 jjjigni ^ 
sirable to obtain data of this kind, but the expenmen a p ^ujiougfi 
mains the most powerful tool of analysis we have ^ to 

%%e must recognize the problem of inference from one con 
Chapter 6


INTERPERSONAL EXPECTATIONS: 

Effects of the Experimenter's Hypothesis* 


Robert Rosenthal 
Harvard University 


The social situation which comes into being when a behavioral scientist
encounters his research subject is a situation of both general and
unique importance to the behavioral sciences. Its general importance
derives from the fact that the interaction of experimenter and subject,
like other two-person interactions, may be investigated empirically with
a view to teaching us more about dyadic interaction in general. Its
unique importance derives from the fact that the interaction of
experimenter and subject, unlike other dyadic interactions, is a major source
of our knowledge in the behavioral sciences.
To the extent that we hope for dependable knowledge in the behavioral
sciences, we must have dependable knowledge about the
experimenter-subject interaction specifically. Without an understanding of the
data collection situation we can no more hope to acquire accurate
information for our disciplines than astronomers and zoologists could hope
to acquire accurate information without their understanding the
operation of their telescopes and microscopes. It is for these reasons that
increasing interest has been shown in the investigation of the
experimenter-subject interaction system. And the outlook is anything but bleak.

* Preparation of this chapter and much of the research summarized here was
supported by research grants (G— 170S5, Ci-2IS20, CS— 177, CS— 1, and CS— 1< II)
from the Division of Social Sciences of the National Science Foundation.

do tend to occur in a biased manner. By that we mean that, more
often than we would expect by chance, when errors of observation occur
they tend to give results more in the direction of the observer's
hypothesis (Rosenthal, 1966).

1. Recording Errors. As data collectors observe the behavior of their
subjects, their observations must in some way be recorded. It is no
revelation to point out that errors of recording do occur, but it may
be of interest to try to obtain some estimates of the incidence of such
errors. Table I shows four such estimates based on an older study of
errors in recording responses to a telepathy task, two more recent studies
of errors in recording responses to a person perception task, and one
recent study of errors in recording responses to a numerosity estimation
task. The next-to-last column shows the range of misrecording rates
to be in the neighborhood of one per cent and the last column shows
that perhaps over two-thirds of the errors that do occur are biased
in the direction of the observer's hypothesis.


TABLE I

Recording Errors in Four Experiments

Study       Observers   Recordings   Errors   Error %   Bias %

1 (a)           28        11,125       126     1.13%      68%
2 (b)           30         3,000        20     0.67%      75%
3 (c)           11           828         6     0.72%      67%
4 (d)           34         1,770        30     1.69%      85%

Combined       103        16,723       182     1.09%      71%

(a) Kennedy and Uphoff, 1939
(b) Rosenthal, Friedman, Johnson, et al., 1964
(c) Persinger, Knutson, and Rosenthal, 1968
(d) Weiss, 1957
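
The Error % column of Table I is simply errors divided by recordings. The short Python sketch below recomputes it from the counts in the table, as a check on the arithmetic rather than as anything the original authors ran. The names TABLE_I, error_rate_percent, and combined_counts are illustrative assumptions, and the error count for Study 3 is read as 6, which is the value consistent with the combined total of 182.

```python
# Counts read from Table I: (observers, recordings, errors).
# The Study 3 error count is read as 6, consistent with the combined total.
TABLE_I = {
    "Study 1": (28, 11_125, 126),
    "Study 2": (30, 3_000, 20),
    "Study 3": (11, 828, 6),
    "Study 4": (34, 1_770, 30),
}


def error_rate_percent(recordings: int, errors: int) -> float:
    """Misrecording rate as a percentage of all recordings."""
    return 100.0 * errors / recordings


def combined_counts(table):
    """Sum observers, recordings, and errors over all studies."""
    observers = sum(row[0] for row in table.values())
    recordings = sum(row[1] for row in table.values())
    errors = sum(row[2] for row in table.values())
    return observers, recordings, errors


if __name__ == "__main__":
    for name, (_, recordings, errors) in TABLE_I.items():
        print(f"{name}: {error_rate_percent(recordings, errors):.2f}%")
    total_obs, total_rec, total_err = combined_counts(TABLE_I)
    print(f"Combined: {error_rate_percent(total_rec, total_err):.2f}%")
```

Note that the combined rate pools all recordings, so the large first study dominates it; it is not the average of the four study rates.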

A more preliminary assessment of computational errors is summarized
in Table II. About two-thirds of the experimenters err computationally,
though it seems safe to suggest that, given enough computations to
perform, all experimenters will make computational errors. More
interesting is the combined finding that nearly three out of four experimenters,
when they do err computationally, err in the direction of their
hypothesis.
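
One way to ask whether "nearly three out of four" exceeds what chance would give is an exact binomial tail probability. The sketch below is an illustration only and not an analysis reported in the chapter. It assumes that 16 of the 22 erring experimenters erred toward their hypothesis (the whole count nearest to the reported 73 per cent), that unbiased errors would favor the hypothesis half the time, and that experimenters pooled across studies can be treated as independent. The function name upper_tail_binomial is hypothetical.

```python
from math import comb


def upper_tail_binomial(successes: int, trials: int, p: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Assumed counts: Table II reports 22 erring experimenters, 73 per cent of
# whom erred toward their hypothesis; 16 of 22 is the nearest whole count.
ERRING = 22
BIASED = 16

if __name__ == "__main__":
    p_value = upper_tail_binomial(BIASED, ERRING)
    print(f"P(at least {BIASED} of {ERRING} by chance) = {p_value:.4f}")
```

With so few experimenters, and one study contributing a single case, any such probability should be read as a rough guide only.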

In general, the magnitudes of errors, both biased and unbiased, tend
to be small, and the overall effects of recording and computational errors
on grand means of different treatment conditions tend to be trivial.
It does seem that we can profitably learn about those effects which
the behavioral scientist unwittingly may have on the results of his
research.


I. UNINTENDED EFFECTS OF THE EXPERIMENTER 

It is useful to think of two major types of effects which the behavioral
scientist can have upon the results of his research. The first type operates,
so to speak, in the mind, in the eye, or in the hand of the investigator.
It operates without affecting the actual response of the human or animal
subjects of the research; it is not interactional. The second type of
experimenter effect is interactional; it operates by affecting the actual response
of the subject of the experiment. It is a sub-type of this latter effect,
the effects of the investigator's expectancy or hypothesis on the results
of his research, that will occupy most of the discussion. First,
some examples of other effects of the investigator on his research will
be mentioned.

A. Observer Effects

In any science, the experimenter must make provision for the 
observation and recording of the events under study It is no ^ 
so easy to be sure that one has, m fact, made an accurate o 
That lesson was learned by the psychologists, who were gra 6 
leam it, but it was not the psychologists who focused attention 
ongmally It was the astronomers ^ tJje 

Just near the end of the 18th century, the royal astronomer at the
Greenwich Observatory, Maskelyne, discovered that his assistant,
Kinnebrook, was consistently "too slow" in his observations of the movement
of stars across the sky. Maskelyne cautioned Kinnebrook about his
"errors" but the errors continued for months. Kinnebrook was discharged.
The man who might have saved that job was Bessel, the astronomer
at Königsberg, but he was 20 years too late. It was not until then that
he arrived at the conclusion that Kinnebrook's "error" was probably
not willful. Bessel studied the observations of stellar transits made by
a number of senior astronomers. Differences in observation, he
discovered, were the rule, not the exception (Boring, 1950).

This early observation of the effects of the scientist on the observations
of science made Bessel perhaps the first student of the psychology of
scientists. More contemporary research on the psychology of scientists
has shown that, while observer errors are not necessarily serious, they
munity. We are free to agree or disagree with any specific interpretation.
Not so with the case of the observations themselves. Often these are
made by a single investigator so that we are not free to agree or disagree.
We can only hope that no observer errors occurred, and we can, and
should, try to repeat the observations.

Examples of interpreter effects in the physical, biological, and
behavioral sciences are not hard to come by, and to an earlier theoretical
discussion and inventory of examples (Rosenthal, 1966) we need only
add some recent instances. In the physical sciences, Polanyi (1967)
refers to the possible interpretations of those data which appeared to
support Velikovsky's controversial theory dealing with the history of
our planet and the origin of the planet Venus. In the same paper, Polanyi
gives us other examples of interpreter effects and places them all into
a broad conceptual framework in which the antecedent plausibility plays
a prominent role.

In a disarming retraction of an earlier interpretation, Bradley (1968,
p. 437) told how the microscopic particles he found turned out not to
be the unmineralized fossil bacteria he had originally believed them
to be. The minute spheres, it turned out, were artifactually formed
fluorite: "I was as completely taken in as Don Quixote."

Carlson and Armelagos (1965) discuss a considerably more
macroscopic "find" reported in the literature of paleopathology. They argue
convincingly that the prehistoric curved bark bands earlier interpreted
as orthopedic corsets were actually hoods for Indian cradleboards.
Additional recent discussions of interpreter effects in the behavioral sciences
can be found in Honorton (1967) and Shaver (1966).

C. Intentional Effects 

It happens sometimes in undergraduate laboratory science courses
that students "collect" and report data too perfect to be true. (That
probably happens most often when students are taught to be scientists
by being told what results they must get to do well in the course,
rather than being taught the logic of scientific inquiry and the value
of being quite open-eyed and open-minded.) Unfortunately, the history
of science tells us that not only undergraduates have been dishonest
in science.

Intentional effects, though rare, must be regarded as part of the
inventory of the effects of the investigator himself, and some historically
important cases from the physical, biological, and behavioral sciences have
been described in detail elsewhere (Rosenthal, 1966). Additional
anecdotal evidence is presented for the case of behavioral research by
Roth (1965) and Gardner (1966). Martin Gardner, professional mag-
TABLE II

Biased Computational Errors in Three Studies

                                                 Experimenters
Study                                         Total N   Erring N   Erring %   Bias %

Laszlo and Rosenthal, 1967                        3         3        100%      100%
Rosenthal, Friedman, Johnson, et al., 1964       30        18         60%       67%
Rosenthal and Hall, 1968 (a)                      1         1        100%      100%

Combined                                         34        22         65%       73%

(a) For a sample of five research assistants performing 5,012 calculations
and transcriptions there were 41 errors detected on recheck for a 0.82 per
cent error rate.

A few of the experimenters studied, however, made errors sufficiently
large and sufficiently non-canceling to have affected the results of
an experiment in which they were the only data recorders and
processors.

Successive independent checking and rechecking of a set of
observations can give us whatever degree of accuracy is needed, though an
absolute zero level of error seems an unlikely goal to achieve. A
historical and theoretical discussion of observer effects and their control
is available elsewhere (Rosenthal, 1966).


B. Interpreter Effects 

The mterpretation of the data collected is part of the t^oral 

and a glance at any of the technical journals of contemporary ® ^ 

science will suggest strongly that, while we only rarely d^ate 
other’s observations, we often debate the interpretation of of 

tions It IS as difficult to slate the rules for accurate but 

data as it is to state the rules for accurate observation o 
the vanety of interpretations offered in explanation of 
imply that many of us must turn out often to be wrong ^p,f5call>' 
of science generally, and the history of psychology more 
suggest that more of us are wrong longer than we need to c 
wc hold our theories not quite lightly enough The common 
of theory monogamy has its advantages, however It does 
vatcd to make more crucial observations In any case, interp farmer 
seem less serious than observer effects The reason is t a 
are public while the latter are pnvatc Given a set o o 
their interpretations become generally available to the sci 
So far we have seen that both the sex of the experimenter and the
sex of the subject can serve as significant determinants of the way in
which the investigator conducts his research. In addition, however, we
find that when the sex of the experimenter and the sex of the subject
are considered simultaneously, certain interaction effects emerge. Thus,
male experimenters contacting female subjects, and female experimenters
contacting male subjects, tend to require more time to collect portions
of their data than do male or female experimenters contacting subjects
of the same sex (Rosenthal, 1967). This tendency for opposite-sex dyads
to prolong their data collection interactions has also been found in a
verbal conditioning experiment by Shapiro (1966).

Other interesting interaction effects occur when we examine closely
the sound motion pictures of male and female experimenters contacting
male and female subjects. Observations of experimenters' friendliness
were made by two different groups of observers. One group watched
the films but did not hear the sound track. The other group listened
to the sound track but did not view the films. From the resulting ratings,
a measure of motor or visual friendliness and an independent measure
of verbal or auditory friendliness were available. (The correlation
between ratings of friendliness obtained from these independent channels
was only .29.) Among male experimenters, there was a tendency (not
statistically significant) for their movements to show greater friendliness
than their tone of voice, and to be somewhat unfriendly toward their
male subjects in the auditory channel of communication. It was among
the female experimenters that the more striking effects occurred. The
females were quite friendly toward their female subjects in the visual
channel but not in the auditory channel. With male subjects, the situation
was reversed significantly. Though not friendly in the visual mode,
female experimenters showed remarkable friendliness in the auditory
channel when contacting male subjects.

The quantitative analysis of sound motion pictures is not yet far
enough developed that we can say whether such channel discrepancy
in the communication of friendliness is generally characteristic of women
in our culture, or only of advanced female students in psychology, or
only of female investigators conducting experiments in person
perception. Perhaps it would not be farfetched to attribute the obtained channel
discrepancy to an ambivalence over how friendly they ought to be.
Quite apart from considerations of unintended effects of the
experimenter, such findings may have some relevance for a better
understanding of communication processes in general.

Though the sex of the experimenter does not always affect the
performance of the subject, in a great many cases it does. Gall and Mendel-
cian and editor of the "Mathematical Games" section of the Scientific
American, provides an excellent summary of how behavioral scientists
can be deceived by their over-eager subjects of research in dermo-optical
perception, and also how they can prevent such deception.

Intentional effects, interpreter effects, and observer effects all operate
without the investigator's influencing his subject's response to the
experimental task. In those effects of the experimenter himself to which we
now turn, we shall see that the subject's response to the experimental
task is influenced.


D. Biosocial Effects

The sex, age, and race of the investigator have all been found to 
affect the results of his research (Rosenthal, 1966) What we o no 
know and what we need to learn is whether subjects respond di 
simply to the presence of experimenters varying in these biosocia ^ 
tnbutes or whether experimenters varymg m those attributes e 
differently toward their subjects and, therefore, obtain pn 

from them because they have, m effect, altered the expenmental si a i 
for lliAir eiiViioolc Co cncrtTPcts that male and t® 


for their subjects So far the evidence suggests that male 
experimenters conduct the “same” person perception expei 
differently so that the different results they obtain may be 


to those unintentionally different manipulations Male 
for example, were found m two experiments to be more fnen y 
their subjects than female expenmenters (Rosenthal, 1967) enters 

Biosocial attributes of the subject can also affect the 
behavior, which, in turn, mTy affect the subject’s responses In 
for example, the interactions between expenmenters and their 
were recorded on sound films It was found that only 12 pe^ 
the experimenters exer smiled at their male subjects, while 70 p 
of the experimenters smiled at their female subjects^ Smihng^ 
expenmenters, it was discovered, affected the subjects’ tint 

this cxidence and from some more detailed analyses which 
female subjects may be more protectively treated by their exper . 
(Rosenthal, 1966, 1967), it might be suggested that in the psyc 
expenment, chivalry is not dead This news may be 
and It IS interesting psychologically, but it is very disconcerting 
logical!) Sex differences are well established for many ^nd^ 

But a question must now be raised as to whether sex genrt 

emerge from ps)chological expenments are due to the su j^entrt 
morpholog), enculturation, or simply to the fact that sen^^’ 

treated his male and female subjects differently so t , > 
the\ \\ ere not rcalb in the same expenment at all 
a more anxious experimenter cannot conduct just the same experiment
as a less anxious experimenter. It appears that in experiments which
have been conducted by just one experimenter, the probability of
successful replication by another investigator is likely to depend on the
similarity of his personality to that of the original investigator.

Anxiety of the experimenter is just one of the experimenter variables
affecting the subjects' responses in an unintended manner. Crowne and
Marlowe (1964) have shown that subjects who score high on their scale
of need for approval tend to behave in such a way as to gain the
approval of the experimenter. Now there is evidence that suggests that
experimenters who score high on this measure also behave in such a
way as to gain approval from their subjects. Analysis of filmed
interactions showed that experimenters scoring higher on the Marlowe-Crowne
scale spoke to their subjects in a more enthusiastic and a more friendly
tone of voice. In addition, they smiled more often at their subjects and
slanted their bodies more toward their subjects than did experimenters
lower in the need for approval.

Earlier research by Towbin (1959) has shown that the examiner's
power to control his patient's fate can be a partial determinant of the
patient's Rorschach responses, though the status of the examiner,
independent of his power to control the patient's destinies, had little effect.
We might suppose that a Roman Catholic priest would obtain different
responses to personal questions asked of Roman Catholic subjects than
would a Roman Catholic layman. That was the question addressed in
an experiment by Walker, Davis, and Firetto (1968). They had a layman
and a priest, each garbed sometimes as layman and sometimes as priest,
administer a series of personal questions to male and female subjects.
The results were complex but interesting, male and female subjects
responding differentially not so much to priest versus layman but rather
to whether the priest and layman were playing their true roles or
simulating those roles. While this study showed no simple effect of being
contacted by a priest as opposed to a layman, an earlier study did show
such differences (Walker and Firetto, 1965).

"Warmer" experimenters have also been found often to obtain quite
different responses from their subjects than "cooler" experimenters. Some
of the more recent support for this proposition comes from the work
of Engram (1966) with children and of Goldblatt and Schackner (1963)
with college students. These latter workers found that their subjects'
judgments of affect in photographs were dramatically influenced by the
degree of friendliness shown by the data collectors. A pioneering study
by Malmo, Boag, and Smith (1957) showed that within-experimenter
variation could also serve as a powerful unintended determinant of sub-
sohn (1966) found, for example, that male experimenters elicited more
creative problem solutions than did female experimenters and that, in
general, female subjects were more affected in their performance than
male subjects by the sex of the experimenter. Cieutat (1965), on the
other hand, found that female experimenters elicited intellectual
performance from children superior to that obtained by male experimenters.
In addition, children tended to perform better for examiners of the
opposite sex. Follow-up research by Cieutat and Flick (1967)
found these effects to be appreciably diminished. Glixman (1 )
presents partial support for the proposition that the sex of the
experimenter may interact with the type of task required to determine, in
part, the subject's response, and Kintz, Delprato, Mettee, Persons, and
Schappe (1965) have data suggesting that sex of experimenter may
be a variable affecting the maze behavior of albino rats.

The race or ethnic grouping of the experimenter also often a ec 
the subject’s response Vaughn (1963) found that Maori and 
experimenters differentially affected the responses of Maon ana 
school children such that the Maori children preferred figures of t 
own race considerably more when the experimenter was also a 
W enk (1966) found that Negro subjects scored appreciably hig 
nonlanguagc tests of intellectual functioning when the exammer w 
Negro rather than white Summers and Hammonds (1966) loun 
the presence of a Negro investigator considerably decreased self , 
of anti-Negro prejudice But m a more chnical context, Womac 
Wagner (1967) found only personal charactenstics other than 
affect the patients’ responses to professionally identified lntervle^' 


E. Psychosocial Effects

Expenmenters who differ along such personal and social 
as anxiety, need for approval, status, and warmth tend to obtain 
responses from their research subjects, and a summary of ^ 
of these and other vanables is available (Rosenthal, 1966) 
for example, does the more anxious experimenter do in the expe 
that leads his subjects to respond differently? We might exp^ ^ 
anxious experimenters to be more fidgety, and that is just "na 
are Expenmenters sconng higher on the Taylor Manifest to 

base been observed from sound motion pictures (Rosenthal, 
show a greater degree of general body activity and to have a les 
nant tone of voice What effects just such behavior on the pa 
expenmenter will have on the subjects’ responses depends 
on the particular expenment being conducted and, very likely, o 
characteristics of the subject as well In any case, vve must as 
which were filmed, for example, subjects contacted by experimenters
whose behavior changed as described rated stimulus persons as less
successful (Rosenthal, 1966).

2. Subject Behavior. The experimenter-subject communication
system is a complex of intertwining feedback loops. The experimenter's
behavior, we have seen, can affect the subject's response. But the
subject's behavior can also affect the experimenter's behavior, which in
turn affects the subject's behavior. In this way, the subject plays a
part in the indirect determination of his own response. The
experimental details are given elsewhere (Rosenthal, 1966; Rosenthal, Kohn,
Greenfield, and Carota, 1965). Briefly, in one experiment, half the
experimenters had their experimental hypotheses confirmed by their
first few subjects, who were actually accomplices. The remaining
experimenters had their experimental hypotheses disconfirmed. This
confirmation or disconfirmation of their hypotheses affected the
experimenters' behavior sufficiently so that from their next subjects, who were
bona fide and not accomplices, they obtained significantly different
responses not only to the experimental task, but on standard tests of
personality as well. These responses were predictable from a knowledge
of the responses the experimenters had obtained from their
earlier-contacted subjects.

There is an interesting footnote on the psychology of the accomplice
which comes from the experiment alluded to. The accomplices had been
trained to confirm or to disconfirm the experimenter's hypothesis by
the nature of the responses they gave the experimenter. These
accomplices did not, of course, know when they were confirming an
experimenter's hypothesis or, indeed, that there were expectancies to be
confirmed at all. In spite of the accomplices' training, they were significantly
affected in the adequacy of their performance as accomplices by the
expectancy the experimenter had of their performance, and by whether
the experimenter's hypothesis was being confirmed or disconfirmed by
the accomplices' responses. We can think of the accomplices as
experimenters, and the experimenters as the accomplices' targets or "victims."
It is interesting to know that experimental targets are not simply affected
by experimental accomplices. The targets of our accomplices, like the
subjects of our experimenters, are not simply passive responders. They
"act back."

3. Experimental Scene. One of the things that happens to the
experimenter which may affect his behavior toward his subject, and thus the
subject's response, is that he falls heir to a specific scene in which to
conduct his experiment. Riecken (1962) has pointed out how much
there is that we do not know about the effects of the physical scene
jects' responses. A particular data collector's variations in feeling state
were found to be related to his subjects' physiological responses. When
the experimenter had a "bad day," his subjects' heart rate showed
significantly greater acceleration than when he had a "good day."
Surprisingly, the data collector's feeling state was not particularly related to
his own physiological responses.

F. Situational Effects 

The degree of acquaintanceship between experimenter and subject,
the experimenter's level of experience, and the things that happen to
him before and during his interaction with his subject have all been
shown to affect the subject's responses (Rosenthal, 1966). Most recently,
for example, Jourard (1968) has shown that experimenters better
acquainted with their subjects and more open to them obtain not only the
more open responses we might expect on the basis of reciprocity, but
also obtain superior performance in a paired-associate learning task.
1. Experimenter Experience. The kind of person the experimenter
is before he enters his laboratory can in part determine the responses
he obtains from his subjects. From the observation of experimenters'
behavior during their interaction with their subjects there are some clues
as to how this may come about. There is also evidence that the kind
of person the experimenter becomes after he enters his laboratory may
alter his behavior toward his subjects and lead him, therefore, to obtain
different responses from his subjects.

In the folklore of psychologists who conduct experiments, t 
the notion that sometimes, perhaps more often than we woul 
subjects contacted early in an expenment behave 
jects contacted later There may be something to this bit of ore 
if we make sure that subjects seen earlier and later in an 
come from the same population The difference may be due to c 
over the course of the expenment in the behavior of the exp^n^ 
From what we know of performance curves, we might pr® 
a practice effect and a fatigue effect on the part of the 
There is evidence for both In expenments which were filmed 1 
thal, 1966), expenmenters became more accurate and faster in ® 
mg of their instructions to their later-contacted subjects Tna 
simply to be a practice effect In addition, expenmenters 
bored or less interested over the course of the expenment as o _ ^ 

from their behavior m the expenmental interaction As we mig 
predict, expenmenters became less tense ivith more course 

changes which occur m the expenmenters’ behavior dui^g * 
of their expenment affect their subjects’ responses In the c pt 
More specifically, personality differences among principal
investigators, and whether the principal investigator has praised or reproved
the experimenter for his performance of his data-collecting duties, affect
the subjects' subsequent perception of the success of other people and
also affect subjects' scores on standardized tests of personality (e.g.,
Taylor Manifest Anxiety scale).

In one experiment, there were 13 principal investigators and 26
experimenters. The principal investigators first collected their own data and
it was found that their anxiety level correlated positively with the ratings
of the success of others (pictured in photographs) they obtained from
their subjects (r = .66, p = .03). Each principal investigator was then
to employ two research assistants. On the assumption that principal
investigators select research assistants who are significantly like or
significantly unlike themselves, the two research assistants were assigned
to principal investigators at random. That was done so that research
assistants' scores on the anxiety scale would not be correlated with their
principal investigator's anxiety scores. The randomization was successful
in that the principal investigators' anxiety correlated only .02 with the
anxiety of their research assistants.

The research assistants then replicated the principal investigators’
experiments. Remarkably, the principal investigators’ level of anxiety also
predicted the responses obtained by their research assistants from their
new samples of subjects (r = .40, p = .07). The research assistants’ own
level of anxiety, while also positively correlated with their subjects’
responses (r = .24), was not as good a predictor of their subjects’
responses as was the anxiety level of the principal investigators. Something
in the covert communication between the principal investigator and his
research assistant altered the assistant’s behavior when he subsequently
contacted his subjects. We know that the effect of the principal
investigator was mediated in this indirect way to his assistant’s subjects,
because the principal investigator had no contact of his own with those
subjects.

Other experiments show that the data obtained by the experimenter
depend in part on whether the principal investigator is male or female,
whether the principal investigator makes the experimenter self-conscious
about the experimental procedure, and whether the principal investigator
leads the experimenter to believe he has himself performed well or
poorly at the same task the experimenter is to administer to his own
subjects. The evidence comes from studies in person perception, verbal
conditioning, and motor skills (Rosenthal, 1966).

As we would expect, these effects of the principal investigator on
his assistant’s subjects are mediated by the effects on the assistant’s

in which an experimental transaction takes place. We know little enough
about how the scene affects the subject’s behavior; we know even less
about how the scene affects the experimenter’s behavior.

The scene in which the experiment takes place may affect the subject’s
response in two ways. The effect may be direct, as when a subject
judges others to be less happy when his judgments are made in an
“ugly” laboratory (Mintz, 1957). Or, the effect may be indirect, as when
the scene influences the experimenter to behave differently and
change in the experimenter’s behavior leads to a change in the subject’s
response. Evidence that the physical scene may affect the
behavior comes from some data collected with Suzanne Woolsey.
had available eight laboratory rooms which were varied as to the
“professionalness,” the “orderliness,” and the “comfortableness” of their
appearance. The 14 experimenters of this study were randomly assigned
to the eight laboratories. Experimenters took the experiment significantly
more seriously if they had been assigned to a laboratory which was
both more disordered and less comfortable. These
graduate students in the natural sciences or in law school.
felt that scientifically serious business is carried on best in a
and severely furnished laboratory which fits the stereotype of the
scientist’s ascetic pursuit of truth.

In this same experiment, subjects described the behavior of their
experimenter during the course of the experiment. Experimenters who
had been assigned to more professional appearing laboratories were
described by their subjects as significantly more expressive-voiced, more
expressive-faced, and as more given to the use of hand gestures. There
were no films made of these experimenters interacting with their subjects,
so we cannot be sure that their subjects’ descriptions
There is a chance that the experimenters did not really behave as
described but that subjects in different appearing laboratories
their experimenters differently because of the operation of
The direct observation of experimenters’ behavior in different physical
contexts should clear up the matter to some extent.

4. The Principal Investigator. More and more research is carried
out in teams and groups so that the chances are
one experimenter will be collecting data not for himself
and more there is a chance that the data are being collected
principal investigator to whom the experimenter is responsible.
data are presented elsewhere (Rosenthal, 1966), but here
said that the response a subject gives his experimenter
mined in part by the kind of person the principal investigator
by the nature of his interaction with the experimenter.

in a Rorschach study, Marwit (1968) found that experimenters whose
vocal behavior was more hostile elicited significantly more hostile
behavior from their subjects. Finally, Klinger (1967) has shown that
even when based entirely on nonverbal cues, an experimenter who
appeared more achievement motivated elicited significantly more
achievement-motivated responses from his subjects.


II. THE EXPERIMENTER’S EXPECTANCY

In the discussion just concluded we have considered briefly some
sources of artifact deriving from the experimenter himself. We have
seen that a variety of personal and situational variables associated with
the experimenter may unintentionally affect the subject’s responses. Our
discussion was not exhaustive but only illustrative, and a number of
sources are available for obtaining a more complete picture (Krasner
and Ullmann, 1965; Masling, 1960, 1966; McGuigan, 1963; Rosenthal,
1966; Sarason, 1965; Sattler and Theye, 1967; Stevenson, 1965; Zax,
Stricker, and Weiss, 1960). We turn our attention now to a somewhat
more detailed consideration of another potential source of artifact
associated with the experimenter: his research hypothesis.

The particular expectation a scientist has of how his experiment will
turn out is variable, depending on the experiment being conducted,
but the presence of some expectation is virtually a constant in science.
The independent and dependent variables selected for study by the
scientist are not chosen by means of a table of random numbers. They
are selected because the scientist expects a certain relationship to emerge
among them. Even in those less carefully planned examinations of
relationships called “fishing expeditions” or, more formally, “exploratory
analyses,” the expectation of the scientist is reflected in the selection
of the entire set of variables chosen for examination. Exploratory
analyses of data, like real fishing expeditions, do not take place in randomly
selected pools.

These expectations of the scientist are likely to affect the choice of
the experimental design and procedure in such a way as to increase
the likelihood that his expectation or hypothesis will be supported. That
is as it should be. No scientist would select intentionally a procedure
likely to show his hypothesis in error. If he could too easily think of
procedures that would show this, he would be likely to revise his
hypothesis. If the selection of a research design or procedure is regarded
by another scientist as too “biased” to be a fair test of the hypothesis,
he can test the hypothesis employing oppositely biased procedures or

behavior toward his subjects. Thus, experimenters who have been made
more self-conscious by their principal investigator behave less
courteously toward their subjects, as observed from films of their interactions
with their subjects. In a different experiment, involving this time a verbal
conditioning task, experimenters who had been given more favorable
evaluations by their principal investigator were described by their
subsequently contacted subjects to be more casual and more courteous. These
same experimenters, probably by virtue of their altered behavior toward
their subjects, obtained significantly more conditioning responses from
their subjects. All ten of the experimenters who had been more favorably
evaluated by their principal investigator showed conditioning effects
among their subjects, but only five of the nine experimenters who had been
unfavorably evaluated obtained any conditioning.


G. Modeling Effects

From the fields of survey research, child development,
psychology and from laboratory experiments, there is a reasonable amount
of evidence to suggest that the nature of the data collector’s
performance may be a nontrivial determinant of his subjects’ subsequent
task performance (Rosenthal, 1966). Though most of the evidence for
such modeling effects comes from studies in which the
subject contact is very brief, there are some studies, usually of
study” variety, that are based on more prolonged contact. One of
the classic study of Escalona (1945), shows that modeling effects
not depend on verbal communication. The subjects were 50 babies,
of them less than one year old. On alternate days they
orange juice and tomato juice and many of the babies
heartily of one juice than of the other. It turned out that
who fed the babies also had marked preferences for one
the other and that babies fed by orange juice preferrers preferred
juice while babies fed by tomato juice preferrers preferred
When babies were reassigned to new feeders with a different preference,
the babies tended to change their preference to coincide with
the new feeder. In another long-term experiment, Yando and Kagan
(1966) found that first graders taught by teachers whose
making was “reflective” became significantly
course of the school year relative to the children taught by
whose decision-making was “impulsive.”

In a recent report of a more short-term interpersonal contact,
(1968) found that the experimenter’s degree of
phrases of a phrase-association test was significantly
subjects’ subsequent degree of disturbance on hostile

erable evidence for the operation of interpersonal self-fulfilling
prophecies. This evidence, while ranging from the anecdotal to the
experimental, with emphasis on the former, permits us to begin consideration
of more recent research on expectancy effects with possibly more than
very gentle priors (Mosteller and Tukey, 1965). The literatures referred
to have been reviewed elsewhere (Rosenthal, 1964a, b, 1965, 1966;
Rosenthal and Jacobson, 1968), but it may be of interest here to give one
illustration from experimental psychology. The example is one known
generally to psychologists as a case study of an artifact in animal
research. It is less well known, however, as a case study of the effect
of experimenter expectancy. While the subject sample was small, the
experimenter sample was very large indeed. The case, of course, is that
of Clever Hans (Pfungst, 1911). Hans, it will be remembered, was the
horse of Mr. von Osten, a German mathematics teacher. By means of
tapping his foot, Hans was able to add, subtract, multiply, and divide.
Hans could spell, read, and solve problems of musical harmony. To
be sure, there were other clever animals at the time, and Pfungst tells
about them. There was “Rosa,” the mare of Berlin, who performed
similar feats in vaudeville, and there was the dog of Utrecht, and the reading
pig of Virginia. All these other clever animals were highly trained
performers who were, of course, intentionally cued by their trainers.
Von Osten, however, did not profit from his animal’s talent, nor did
it seem at all likely that he was attempting to perpetrate a fraud. He
swore he did not cue the animal, and he permitted other people to
question and to test the horse even without his being present. Pfungst
and his famous colleague, Stumpf, undertook a program of systematic
research to discover the secret of Hans’ talents. Among the first
discoveries made was that if Hans could not see the questioner, then the
horse was not clever at all. Similarly, if the questioner did not himself
know the answer to the question, Hans could not answer it either. Still,
Hans was able to answer Pfungst’s questions as long as the investigator
was present and visible. Pfungst reasoned that the questioner might
in some way be signaling to Hans when to begin and when to stop
tapping his foot. A forward inclination of the head of the questioner
would start Hans tapping, Pfungst observed. He tried then to incline
his head forward without asking a question and discovered that this
was sufficient to start Hans tapping. As the experimenter straightened
up, Hans would stop tapping. Pfungst then tried to get Hans to stop
tapping by using very slight upward motions of the head. He found
that even the raising of his eyebrows was sufficient. In fact, even the
dilation of the questioner’s nostrils was a cue for Hans to stop tapping.
When the questioner bent forward more, the horse would tap faster.

less biased procedures by which to demonstrate the greater value
his hypothesis. The designs and procedures employed are, to a great
extent, public knowledge, and it is this public character that permits
relevant replications to serve the required corrective function.
The major concern of this chapter is with the effects of the
experimenter’s expectation on the responses he obtains from his subjects. The
consequences of such an expectancy bias can be quite serious.
Expectancy effects on subjects’ responses are not public matters. It is not
only that other scientists cannot know whether such effects
in the experimenter’s interaction with his subjects; the
self may not know whether these effects have occurred. Moreover, there
is the likelihood that the experimenter has not even considered the
possibility of such unintended effects on his subjects’ responses is
so different from the situations already discussed wherein the
response is affected by any attribute of the experimenter.
problem will be discussed in more detail. For now it is enough to
that while the other attributes of the experimenter affect the subject’s
response, they do not necessarily affect these responses
as a function of the subject’s treatment condition. Expectancy effects,
on the other hand, always do. The sex of the experimenter does not
change as a function of the subject’s treatment condition in an
experiment. The experimenter’s expectancy of how the subject will
does change as a function of the subject’s treatment condition.

Although the focus of this chapter is primarily on the effects of a
particular person, the experimenter, on the behavior of a
other, the subject, it should be emphasized that many of the effects
of the experimenter, including the effects of his expectancy, may have
considerable generality for other social relationships.
That one person’s expectation about another person’s behavior may
contribute to a determination of what that behavior will
has been suggested by various theorists. Merton (1948) developed the
very appropriate concept of “self-fulfilling prophecy.”
an event and the expectation of the event then changes the behavior
of the prophet in such a way as to make the prophesied event more
likely. Gordon Allport (1950) applied the concept of
expectancies to an analysis of the causes of war. Nations expecting to
go to war affect the behavior of their opponents-to-be by
which reflects their expectations of armed conflict.
to remain out of wars at least sometimes manage to avoid entering
into them.

Drawn from the general literature, and the
professions, survey research, and laboratory psychology,

In summarizing his difficulties in learning the nature of Clever Hans’
talents, Pfungst felt that he had been too long off the track by “looking
for, in the horse, what should have been sought in the man.” Perhaps,
too, when we conduct research in the behavioral sciences we are
sometimes caught looking at our subjects when we ought to be looking at
ourselves. It was to this possibility that much of the research to be
reviewed here was addressed.

A. Animal Learning 

A good beginning might have been to replicate Pfungst’s research,
but with horses hard to come by, rats were made to do (Rosenthal
and Fode, 1963a).

A class in experimental psychology had been performing experiments
with human subjects for most of a semester. Now they were asked to
perform one more experiment, the last in the course, and the first
employing animal subjects. The experimenters were told of studies that had
shown that maze-brightness and maze-dullness could be developed in
strains of rats by successive inbreeding of the well- and the
poorly-performing maze runners. Sixty laboratory rats were equitably divided
among the 12 experimenters. Half the experimenters were told that their
rats were maze-bright while the other half were told that their rats
were maze-dull. The animal’s task was to learn to run to the darker
of two arms of an elevated T-maze. The two arms of the maze, one
white and one gray, were interchangeable, and the “correct” or rewarded
arm was equally often on the right as on the left. Whenever an animal
ran to the correct side he obtained a food reward. Each rat was given
10 trials each day for five days to learn that the darker side of the
maze was the one which led to the food.

Beginning with the first day and continuing on through the
experiment, animals believed to be better performers became better
performers. Animals believed to be brighter showed a daily improvement
in their performance, while those believed to be dull improved only
to the third day and then showed a worsening of performance.
Sometimes an animal refused to budge from his starting position. This
happened of the time among the allegedly bright rats, but among
allegedly dull rats it happened 29% of the time. When animals did respond
correctly, those believed to be brighter ran faster to the rewarded side
of the maze than did even the correctly responding rats believed to
be dull (z = +2.05).

When the experiment was over, all experimenters made ratings of
their rats and of their own attitudes and behavior vis-à-vis their animals.
Those experimenters who had been led to expect better performance

This added to the reputation of Hans as brilliant. That is, when a large
number of taps was the correct response, Hans would tap rapidly until
he approached the region of correctness, and then he would begin to
slow down. It was found that questioners typically bent forward more
when the answer was a long one, gradually straightening up as Hans
got closer to the correct number.

For some experiments, Pfungst discovered that auditory cues
functioned additively with visual cues. When the experimenter was
Hans was able to respond correctly 31 per cent of the time
one of many placards with different words written on it, or cloths of
different colors. When auditory cues were added, Hans responded
correctly 56 per cent of the time.

Pfungst himself then played the part of Hans, tapping out responses
to questions with his hand. Of 25 questioners, 23 unwittingly cued
Pfungst as to when to stop tapping in order to give a correct response.
None of the questioners (men and women of all ages and occupations)
knew the intent of the experiment. When errors occurred, they were
usually only a single tap from being correct. The subjects of this study,
including an experienced psychologist, were unable to discover that they
were unintentionally emitting cues.

Hans’ amazing talents, talents rapidly acquired too by Pfungst, serve
to illustrate the power of the self-fulfilling prophecy. Hans’ questioners,
even skeptical ones, expected Hans to give the correct answers to their
queries. Their expectation was reflected in their unwitting signal to Hans
that the time had come for him to end his tapping. The signal caused
Hans to stop, and the questioner’s expectation became the reason for
Hans’ being, once again, correct.

Not all of Hans’ questioners were equally good at fulfilling their
prophecies. Even when the subject is a horse, apparently, the attributes
of the experimenter make a considerable difference in
subject’s response. On the basis of his studies, Pfungst was able to
summarize the characteristics of those of Hans’ questioners
successful in their covert and unwitting communication with the horse.
Among the characteristics of the more successful
influencers were those of tact, an air of dominance, attention to the
at hand, and a facility for motor discharge. Pfungst’s
60 years ago seem not to have suffered excessively for the
modern methods of scaling observations. To anticipate some of the
research findings to be presented later, it must be said that
description seems also to fit those experimenters who are
to affect their human subject’s response by virtue of their
hypothesis.

their animals to be Skinner box bright handled them relatively more,
or said they did, than did experimenters believing their animals to be
dull. The extra handling of animals believed to be brighter may have
contributed in both experiments to the superior learning shown by these
animals.

In addition to the differences in handling reported by the
experimenters of the Skinner box study as a function of their beliefs about
their subjects, there were differences in the reported intentness of their
observation of their animals. Animals believed to be brighter were
watched more carefully, and more careful observation of the rat’s Skinner
box behavior may very well have led to more rapid and appropriate
reinforcement of the desired response. Thus, closer observation, perhaps
due to the belief that there would be more promising responses to be
seen, may have made more effective teachers of the experimenters
expecting good performance.

Cordaro and Ison (1963) employed 17 experimenters to conduct
conditioning experiments with 34 planaria. Five of the experimenters were
led to expect that their worms (two apiece) had already been taught
to make many turning and contracting responses. Five of the
experimenters were led to expect that their worms (also two apiece) had
not yet been taught to make many responses and that in only 100
trials little turning and contracting could be expected. The seven
experimenters of the third group were each given both these opposite
expectancies, one for each of their two worms. Behavior of the worms
was observed by the experimenters looking down into a narrow
and shallow V-shaped trough into which each worm was placed.

The results of the Cordaro and Ison experiment are easily summarized.
Regardless of whether the experimenter prophesied the same results
for both his worms or prophesied opposite results for his two worms,
when the experimenter expected more turning and contracting he
obtained more turning and contracting (z > +3.25). Similar results in
studies of planaria have been obtained in two studies reported by Hartry
(1966) and in studies of rats by Ingraham and Harrington (1966),
and Burnham (1966), to whose study we shall later return. From the
results of these studies we cannot be sure that the behavior of the
animal was actually affected by the expectation of the experimenter,
though that possibility cannot be ruled out. It is also possible, however,
that only the experimenter’s perception of the animal’s behavior was
affected by his hypothesis. That view of the results as examples of
observer effects is quite plausible in the case of the planaria studies. Those
worms, after all, are hard to see. That same view, however, is far less
plausible in the case of the rat studies. It is difficult, for example, to

viewed their animals as brighter, more pleasant, and more likeable.
These same experimenters felt more relaxed in their contacts with the
animals and described their behavior toward them as more pleasant,
friendly, enthusiastic, and less talkative. They also stated that they
handled their rats more often and also more gently than did the
experimenters expecting poor performance.
The next experiment to be described also employed rat subjects, using
this time not mazes but Skinner boxes (Rosenthal and Lawson, 1 ).
Because the experimenters (39) outnumbered the subjects (14),
experimenters worked in teams of two or three. Once again about half the
experimenters were led to believe that their subjects had been specially
bred for excellence of performance. The experimenters who had been
assigned the remaining rats were led to believe that their animals were
genetically inferior.

The learning required of the animals in this experiment was more
complex than that required in the maze learning study. This time the
rats had to learn in sequence and over a period of a full academic
quarter the following behaviors: to run to the food dispenser whenever
a clicking sound occurred, to press a bar for a food reward, to
that the feeder could be turned off and that sometimes it did not pay
to press the bar, to learn new responses with only the clicking sound
as a reinforcer (rather than the food), to bar press only in the presence
of a light and not in the absence of the light, and, finally, to pull
a loop which was followed by a light which informed the animal that
a bar press would be followed by a bit of food.

At the end of the experiment the performance of the animals alleged
to be superior was, in fact, superior to that of the allegedly inferior
animals (z = +2.17), and the difference in learning favored the allegedly
brighter rats in all five of the laboratory sections in which the experiment
was conducted.

Just as in the maze learning experiment, the experimenters of the
present study were asked to rate their animals and their own attitudes
and behaviors toward them. Once again those experimenters who had
expected excellence of performance judged their animals to be brighter,
more pleasant, and more likeable. They also described their own
behavior as more pleasant, friendly, enthusiastic, and less talkative, and
they felt that they tended to watch their animals more closely, to handle
them more, and to talk to them less. One wonders what
the animals by those experimenters who believed their rats to
The absolute amount of handling of animals in this Skinner box
experiment was considerably less than the handling of animals in the maze
learning experiment. Nonetheless, those experimenters who

as a function of his second result? If replication yields B > A at p < .05
he can conclude that A is not always larger than B, that A is too often
too different from B or, and only in this case is he likely to err, that
on the average, A = B. For the moment putting aside considerations
of statistical power differences between replications, it would seem that
considerable information could be conveyed by just the direction of
difference in the two studies and the associated p values. These p values
can be handily traded in for standard normal deviates (z) and they,
in turn, can be added, subtracted, multiplied, and divided. An
algebraically signed normal deviate gives the direction and likelihood of
a difference, while an unsigned normal deviate gives the nondirectional
likelihood. If we have 10 experiments showing A > B at p = .05 and
10 experiments showing B > A at p = .05, then the average directional
z is zero but the average nondirectional z is large enough so that we
would be rash to conclude that A = B. Instead we would probably
want to conclude that A and B differ too often, but unpredictably, and
the research task might then be to reduce this unpredictability.
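
The trade of p values for standard normal deviates can be sketched in a few lines of Python. This is only an illustration: the helper name `p_to_z` is hypothetical, and the text does not say whether the p values are one-tailed or two-tailed, so the sketch assumes two-tailed values by default (p = .05 then corresponds to a z of about 1.96).

```python
from statistics import NormalDist, mean

def p_to_z(p, direction=1, two_tailed=True):
    """Trade a p value for a signed standard normal deviate.

    direction is +1 when A > B and -1 when B > A.
    two_tailed=True is an assumption of this sketch, not a rule from the text.
    """
    tail_area = p / 2 if two_tailed else p
    return direction * NormalDist().inv_cdf(1 - tail_area)

# Ten experiments showing A > B and ten showing B > A, all at p = .05.
zs = [p_to_z(0.05, +1)] * 10 + [p_to_z(0.05, -1)] * 10

mean_directional = mean(zs)                        # signs cancel
mean_nondirectional = mean([abs(z) for z in zs])   # magnitudes do not

print(round(mean_directional, 2), round(mean_nondirectional, 2))
```

The directional mean is zero while the nondirectional mean stays near 1.96, which is the pattern described above: A and B differ often, but not in a predictable direction.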

There is another important advantage to translating the results of
runs of experiments to the standard normal deviate equivalents of the
p values obtained. That advantage accrues from the fact that the sum
of a set of standard normal deviates, when divided by the square root
of the number of zs, yields an overall z that tells the overall likelihood
of the obtained results considered as a set (Mosteller and Bush, 1954).
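
As a minimal sketch of this rule, the overall z is the sum of the individual zs divided by the square root of their number. The function names below are hypothetical, and the three input values are simply z values quoted in this chapter for three of the animal studies (+2.05, +2.17, and the lower bound +3.25), used here only to show the arithmetic rather than to report a combined result from the source.

```python
from math import sqrt
from statistics import NormalDist

def combined_z(zs):
    """Overall z for a set of studies: sum of the zs over sqrt of their count."""
    return sum(zs) / sqrt(len(zs))

def one_tailed_p(z):
    """Upper-tail probability of a standard normal deviate."""
    return 1 - NormalDist().cdf(z)

# Illustrative inputs only: z values quoted in the text for three studies.
study_zs = [2.05, 2.17, 3.25]
overall = combined_z(study_zs)

print(round(overall, 2), one_tailed_p(overall))
```

Because signed zs are summed, studies with results in opposite directions offset one another, while several modest results in the same direction accumulate into a larger overall z.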

In the summary of what is now known about the effects on research
results of the experimenter’s expectancy, we shall want to make use
of these helpful characteristics of the standard normal deviate. But our
purpose will not be solely to summarize the results of what we know
about expectancy effects. An additional purpose is to employ these data
as an illustration of how we may deal in a more global, overall way
with the results of runs of experiments which, despite their differences
in sampling, in procedure, and in outcome, are all addressed essentially
to the same hypothesis or proposition. There may be sufficient usefulness
to the method to warrant its more widespread adoption by those
undertaking a comprehensive review of a given segment of the literature
of the behavioral sciences.

With the increased interest in the effect of the experimenter’s
expectancy there have been increasing numbers of literature summaries
(Rosenthal, 1963, 1964a, 1966, 1968a; Barber and Silver, 1968). These
summaries have been generally cumulative, but none have been
sufficiently systematic. Even the most recent ones have considered less than
half the available experimental evidence.

Altogether, well over a hundred studies are known to have been con-

confuse a rat’s going out on the right arm of the maze with his going
out on the left arm. Everything we know about the effects of handling
on performance, the effects of set on reaction time (which
an experimenter expecting bright Skinner box performance a faster and
better “shaper”), and the base rate for recording errors, suggests that
it is more plausible to think that the rat’s behavior was affected than
that the experimenters saw so badly or lied so much.

But what about those worms? Surely an experimenter cannot affect
a worm in a trough to behave differently as a function of his expectation.
Perhaps not, but perhaps so. Ray Mulry (1966) has
a personal communication how when the control of an unconditioned
stimulus to the worm is not automatic, the experimenter may
teach worms differentially by his application of the unconditioned
stimulus. Even in fully automated set-ups, however, we cannot yet
the possibility that worms can be affected differentially by
observing experimenter. Stanley Ratner (1966) has suggested in a
personal communication that changes in the respiration or even the
of the experimenter might (or might not) affect the closely
Relative to the small worm, in a small amount of water,
watching experimenter presents a potentially large source of
physical stimuli. The hypothesis of worm sensitivity to experimenter
respiration changes is especially interesting in view of earlier
on dogs suggesting that they were substantially influenced by
in their trainers’ respiration (Rosenthal, 1965).

We have now described a number of experiments in which the effects
of experimenter expectancy were investigated in studies of animal
learning. We have given only enough details of a few of the
show the type of research conducted, but it would be
some systematic way to summarize the results of all the experiments
conducted, including those only briefly mentioned. Our need to develop
some systematic way to summarize runs of studies is
shortly we turn to a consideration of human subjects, for
have over 80 studies to consider.


III. APPRAISAL OF A RESEARCH DOMAIN

V,” and echo 

We all join m the clanon calls for “more researc 3 soro 

sentiment that this or that research is in (1 some, jyre 

4 dire) need of replication But sometimes we 
of what to do with the replication when we have it e * jeplicati®” 
X finds A > B at p < 05 What shall be his conclusion 

or less than -1.28, with zs falling between those values entered simply
as .00. The net effect of this procedure, as we shall see, was to make
our overall assessments somewhat too conservative or Type II Error-prone,
but advantages of simplicity and clarity seemed to outweigh this
disadvantage. The sign of the z value, of course, was positive when
the difference between groups was in the predicted direction and
negative when the difference was in the unpredicted direction.
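
The recording convention described here, together with the 10/80/10 split expected under the hypothesis of no expectancy effect, can be sketched in a few lines of Python. This is only an illustration, not part of the original analysis: the names `recorded_z` and `null_band_proportions` are hypothetical, and treating a z of exactly +1.28 or -1.28 as an outer-group value is an assumption.

```python
from statistics import NormalDist

CUTOFF = 1.28  # boundary between the middle group and the two outer groups


def recorded_z(z, cutoff=CUTOFF):
    """Return z as it would be entered in a summary table.

    Exact values are kept only in the outer groups; anything strictly
    between -cutoff and +cutoff is entered as .00. Counting a z of
    exactly +/-cutoff as an outer-group value is an assumption.
    """
    return z if abs(z) >= cutoff else 0.0


def null_band_proportions(cutoff=CUTOFF):
    """Expected share of zs in each group if there is no effect at all."""
    upper = 1.0 - NormalDist().cdf(cutoff)   # zs at or above +cutoff
    lower = NormalDist().cdf(-cutoff)        # zs at or below -cutoff
    return lower, 1.0 - lower - upper, upper


if __name__ == "__main__":
    # Hypothetical zs, chosen only to show the convention at work.
    print([recorded_z(z) for z in (2.33, 0.90, -0.40, -1.95)])
    # Roughly 10, 80, and 10 per cent under the null hypothesis.
    print([round(p, 3) for p in null_band_proportions()])
```

If most of the small zs happen to lie in the predicted direction, entering them as .00 lowers the sum of zs, which is one way to read the conservatism mentioned in the text.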

A special problem occurred, however, for those studies in which the 
basic paradigm was extended to include additional experimental or con- 
trol group conditions. Thus, there were some studies in which a control 
group was included whose experimenters had been given no expectations 
for their subjects’ responses. In those cases the z value associated with 
the overall test of expectancy effects has a meaning somewhat different 
from the situation in which there are only two experimental groups. 
A large nondirectional z means that the experimenter’s expectancy made 
some difference in subjects’ responses but it is not so easy to prefix 
the algebraic sign of the z. It often happened that the control group 
differed more from the two experimental groups than the experimental 
groups differed from each other. Essentially, then, considering all avail- 
able studies there were two hypotheses being tested rather than just 
one. The first hypothesis is that experimenters’ expectations significantly 
affect their subjects’ responses in some way and it is tested by consider- 
ing the absolute magnitude of the zs obtained. The second hypothesis 
is that experimenters’ expectations affected their subjects’ responses in 
such a way as to lead to too many responses in the direction of the 
experimenter’s expectation. This hypothesis is tested by considering the 
algebraic magnitude of the zs obtained. For those studies in which the 
overall z was only a test of the first hypothesis, an additional directional 
z was also computed, addressed to the question of the degree to which
the experimental manipulation of expectations led to hypothesis-confirm- 
ing responses. 

One more difficult decision was to be made. That had to do with 
determining the number of studies to be counted for each paper. Many 
papers described more than one experiment but sometimes investigators 
regarded these as several studies with no overall test of significance
and sometimes investigators pooled the data from several studies and 
tested the overall significance. The guiding principle employed in an 
earlier summary (Rosenthal, 1968a) was to count as more than one 
experiment only those within a given paper that employed both a differ- 
ent sample of subjects and a substantial difference in procedure. For 
this more comprehensive review it was felt to be more informative to 
count as a separate experiment those that employed either a different 

ducted, all based on independent samples and all addressed to the
central proposition that interpersonal expectancies in the research
situation may be a significant source of artifact in behavioral research.
For 80 of these, more or less formal reports are available. These reports
include published papers, papers presented at professional meetings,
doctoral dissertations, masters theses, honors theses, and some
unpublished manuscripts. The remaining studies, in various stages of
completion or availability, posed a problem: should they be included?
It was decided that any research, even if lacking a formal report, would
be included if (a) at least an informal description of the procedures
were available and (b) the raw data were available. An additional dozen
studies met this criterion and are listed in the references as
"unpublished data." Of the remaining studies several [...] become
available to the writer and two reports have been seen but not yet used.
Both of these reports presented data insufficient for computation or even
reasonable estimation of a z value. One of the reports claimed effects of
experimenter expectancy and one claimed the absence of such effects.

For most of the studies summarized, no exact p was given, only that
p was less than or greater than some arbitrary value. In those cases
exact ps were computed for our present purpose. Sometimes there was
more than a single overall test of the same hypothesis of expectancy
effect and in those cases median ps are presented. In a few cases,
orthogonal overall tests were made of the effects of two [...] of
expectancies and in those cases only the more extreme z was retained
after a correction was made for having made two tests. The correction
involved doubling the p before finding the corrected z. The same
correction was employed in those cases where only one p was computed
and it was stated that another test yielded [...] but without that p
being given.
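
The doubling correction can be illustrated with a short Python sketch. It is not the author's computation: the function names are hypothetical, the p is assumed to be one-tailed, and the example p of .01 is invented purely for illustration. Multiplying the more extreme p by the number of tests made before converting it to z is what is often called a Bonferroni-style adjustment.

```python
from statistics import NormalDist


def z_from_p(p_one_tailed):
    """Standard normal deviate whose upper-tail area equals p_one_tailed."""
    if not 0.0 < p_one_tailed < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - p_one_tailed)


def corrected_z(p_more_extreme, n_tests=2):
    """z after multiplying the more extreme p by the number of tests made."""
    return z_from_p(n_tests * p_more_extreme)


if __name__ == "__main__":
    p = 0.01  # hypothetical p for the more extreme of two orthogonal tests
    print(round(z_from_p(p), 2), round(corrected_z(p), 2))
```

The corrected z is always smaller than the uncorrected one, so the adjustment can only make a result look less extreme.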

The basic paradigm in the experiments to be summarized has been
to establish two groups of experimenters and to give each group
a different expectation for their subjects' responses. For studies
following this paradigm it was a straightforward procedure to compute
the overall exact p along with its corresponding z value.

For the purpose of overall assessment it was convenient to divide the
obtained zs into three groups: those of +1.28 or greater, those falling
between ±1.28, and those of -1.28 or smaller. We expect 10 per cent
of the results to fall in the first group, 80 per cent of the results to
fall in the second group, and 10 per cent of the results to fall in the
third group under the hypothesis of no expectancy effect. For the sake
of simplicity, exact zs were recorded only for zs greater than +1.28

sample of subjects or a substantial difference in procedure for some
of their subjects. It was often difficult to decide when some procedural
difference was substantial and quite unavoidably this had to remain
a matter of the writer's judgment. There is no doubt that other workers
might have classified some procedures as substantially different that
were here regarded as essentially similar, and that some studies treated
separately here would have been regarded by others as essentially the
same. The major protection against serious errors of inference due to
this matter of judgment comes from subsequent analyses that consider
all research done by a given principal investigator or at a given
laboratory as a single result.

Judgment and, therefore, possible error also entered into the
calculation of each of the many z values. Methods of dealing with
multiple p values have been referred to, but sometimes (e.g., when no
overall p had been computed) it was necessary to decide on the most
appropriate overall test. Here, too, it seems certain that, for any given
study, different workers might have chosen different tests as most
appropriate. Because of the large amounts of raw data analyzed by the
writer and because of the many secondary analyses performed when the
original analysis was felt to be inappropriate, a rule of thumb aimed
primarily at the goal of convenience was developed. Given a choice of
several more-or-less equally defensible procedures (e.g., multiple
regression, analysis of covariance, treatments by levels analysis of
variance) the most simple procedure was selected (e.g., treatments by
levels) with the criterion of simplicity geared to the use of a desk
calculator. This rule of thumb probably had the effect of decreasing
more zs than it increased since, in general, the more elegant procedures
use more of the information in the data and generally lead to a
reduction of Type II errors. This is likely to be especially true when
the distribution of obtained z values is as radically skewed as the one
obtained.

Despite these sources of errors of conservatism, the possibility of
biased judgment and sheer error on the part of the writer cannot and
should not be ruled out. As protection against the possibility of such
biases we shall later want to make some very stringent corrections.

Since we have already summarized partially the results of studies
of expectancy effect employing animal subjects, that seems a good
subdomain with which to illustrate our systematic summarization
procedure. Table III lists nine studies testing the hypothesis of
expectancy effects. Burnham conducted a single study by our criterion,
but the next three sets of authors conducted two apiece. For Cordaro
and Ison, and for Ingraham and Harrington, one experiment was defined
as that in which
one group of experimenters was given one expectation while the other
group of experimenters was given the opposite expectation. In both
these papers, the second experiment was defined as that in which another
group of experimenters held positive expectations for one group of their
subjects and negative expectations for another group of their subjects.
To put it another way, when experimenter expectancy was a
between-experimenter source of variation that was regarded as an
experiment different from that in which experimenter expectancy was a
within-experimenter source of variation. Finally, the experiments
reported by Hartry were simply two independent experiments conducted at
different times by different experimenters and with differences in
procedures. Interestingly, it was the second study with its tighter
controls for inexperience of the experimenters, for observer errors, and
for intentional errors, that showed the greater magnitude of expectancy
effect with planaria subjects.

TABLE III

Expectancy Effects in Studies of Animal Learning

                  Study                          Standard normal deviate
Code
number   Authors                                 Nondirectional  Directional

1.       Burnham, 1966                     I          1.50          +1.95
2.       Cordaro and Ison, 1963            I          3.96          +3.96
3.       Cordaro and Ison, 1963            II         3.25          +3.25
4.       Hartry, 1966                      I          5.38          +5.38
5.       Hartry, 1966                      II         3.29          +3.29 b
6. a     Ingraham and Harrington, 1966     I          1.48          +1.48
7. a     Ingraham and Harrington, 1966     II         2.10          +2.10
8.       Rosenthal and Fode, 1963a         I          2.33          +2.33
9.       Rosenthal and Lawson, 1964        I          2.17          +2.17

                                           Sum       25.46         +25.91
                                           √9         3              3
                                           z          8.49          +8.64
                                           p <    1/(million)²   1/(million)²

a See also Rosenthal, 1967b, 1967c.
b Not based on exact p; exact z probably exceeds 5 or 6.

For each of the studies in Table III, two z values are given. The
first is the z associated with an overall test of the hypothesis that the
groups of experimenters employed showed some difference. The second
is the z associated with the specific test that experimenters given one

IV. HUMAN SUBJECTS

So far we have given only the results of studies of expectancy effect
in which the subjects were rats or worms. Most of the research available,
however, is based on human subjects and it is those results that we
now consider. In this set of experiments at least 20 different specific
tasks have been employed, but some of these tasks seemed sufficiently
related to one another that they could reasonably be regarded as a
family of tasks or a research area. These areas include human learning
and ability, psychophysical judgments, reaction time, inkblot tests,
structured laboratory interviews, and person perception. We consider
each in turn.

A. Learning and Ability

Table V summarizes the results of the per experiment and the per
principal investigator analysis. There appeared to be no appreciable

TABLE V

Expectancy Effects in Studies of Learning and Ability

                  Study                          Standard normal deviate
Code
number   Authors                                 Nondirectional  Directional

1.       Getter, M., H., W., 1967          I           .00            .00
2.       Hurwitz and Jenkins, 1966         I          1.28          +1.28
3.       Johnson, 1967                     I          3.89          +3.89*
4.       Kennedy, E., W., 1968             I           .00            .00
5.       Kennedy, C., B., 1968             I          2.27          +2.27*
6.       Larrabee and Kleinsasser, 1967    I          1.60          +1.60
7.       Timaeus and Lück, 1968b           I           .00            .00
8.       Wartenberg-Ekren, 1962            I           .00            .00
9.       Wessler, 1968b                    I           .00            .00

SUMMARY

by Study                                   Sum        9.04          +9.04
                                           √9         3              3
                                           z          3.01          +3.01
                                           p <        .0015          .0015

by Principal Investigator                  Sum        8.38          +8.38
                                           √8         2.83           2.83
                                           z          2.96          +2.96
                                           p =        .0015          .0015

* Indicates that experimenter expectancy interacted with another
variable at z > |1.28|.

expectation obtained data in the direction of expectation compared to
when experimenters were given some other expectation. For each column
of zs the sum of the zs is indicated, as is the square root of the number
of zs and the new z obtained when the former is divided by the latter.
Finally, the p associated with the overall z is given. [...]
for studies involving animal subjects, the nondirectional zs are identical
in absolute value to the directional zs in every case except one. That
was because in only that study was there a greater number of
groups than simply those reflecting each of two different expectations.
The combined probability of obtaining the overall z based on either
the nondirectional or the directional zs is infinitesimally small.
To bring the combined p value for the directional zs to the .05 level,
239 experiments with an average directional z value of exactly .00
would have to be conducted.
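
The combining procedure and the count of 239 experiments can be reproduced with a brief Python sketch. It is offered as an illustration rather than as the author's own computation: the function names are hypothetical, the zs are the directional values listed in Table III, and the use of 1.645 (the one-tailed .05 point of the normal curve) as the criterion is an assumption that happens to recover the figure given in the text. If N further studies each contribute z = .00, the combined z becomes (sum of zs) / sqrt(k + N), so N = (sum of zs / 1.645)² - k.

```python
from math import erfc, sqrt

# Directional zs for the nine animal-learning studies of Table III.
TABLE_III_DIRECTIONAL = [1.95, 3.96, 3.25, 5.38, 3.29, 1.48, 2.10, 2.33, 2.17]


def combined_z(zs):
    """Sum of the zs divided by the square root of their number."""
    return sum(zs) / sqrt(len(zs))


def upper_tail_p(z):
    """One-tailed p for a standard normal deviate, stable for large z."""
    return 0.5 * erfc(z / sqrt(2.0))


def null_studies_needed(zs, z_crit=1.645):
    """Number of added studies averaging z = .00 that would pull the
    combined z down to z_crit (1.645 is the one-tailed .05 point)."""
    return max(0.0, (sum(zs) / z_crit) ** 2 - len(zs))


if __name__ == "__main__":
    z = combined_z(TABLE_III_DIRECTIONAL)
    print(round(z, 2))                                    # overall z
    print(upper_tail_p(z))                                # its one-tailed p
    print(int(null_studies_needed(TABLE_III_DIRECTIONAL)))  # null studies
```

The complementary error function is used for the p value because subtracting a normal cumulative probability from 1 loses all precision when z is as large as it is here.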


TABLE IV

Expectancy Effects in Studies of Animal Learning
by Principal Investigators

   Principal investigator                Standard normal deviate
Code
number   Name                            Nondirectional  Directional

I        Burnham                              1.50          +1.95
II       Ison                                 5.11          +5.11
III      Hartry                               6.15          +6.15
IV       Harrington                           2.54          +2.54
V        Rosenthal                            3.19          +3.19

                                   Sum       18.49         +18.94
                                   √5         2.24           2.24
                                   z          8.25          +8.46
                                   p <    1/(million)²   1/(million)²

In addition to a "per experiment" appraisal it was mentioned earlier
that a "per principal investigator" appraisal might protect against
erroneous inferences. Table IV gives the per principal investigator
results and they are found to be very similar to those obtained for the
individual experiments. It would take over 127 new principal
investigators with an average directional z value of exactly .00 to bring
the combined p value to the .05 level. The zs given for each principal
investigator are usually based on the method of combining zs described
earlier in detail. In a few cases, however, an overall test of expectancy
effect for the several studies was already available and that overall
value was employed.
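
How a per principal investigator z might be formed from the per experiment zs can be sketched as follows. This is a hedged illustration only: the function name is hypothetical, and the assignment of each Table III study to an investigator (for example, both Cordaro and Ison studies to Ison, and the Rosenthal and Fode and Rosenthal and Lawson studies to Rosenthal) is inferred from the names listed in Table IV. Pooling each investigator's directional zs by the same sum over square root rule gives values within about .02 of the Table IV entries; the small differences may reflect rounding in the original hand computation, which is an assumption.

```python
from math import sqrt

# (principal investigator, directional z) for each study in Table III.
TABLE_III_ROWS = [
    ("Burnham", 1.95),
    ("Ison", 3.96), ("Ison", 3.25),
    ("Hartry", 5.38), ("Hartry", 3.29),
    ("Harrington", 1.48), ("Harrington", 2.10),
    ("Rosenthal", 2.33), ("Rosenthal", 2.17),
]

# Directional zs as printed in Table IV, for comparison.
TABLE_IV_DIRECTIONAL = {
    "Burnham": 1.95, "Ison": 5.11, "Hartry": 6.15,
    "Harrington": 2.54, "Rosenthal": 3.19,
}


def pool_by_investigator(rows):
    """Combine each investigator's zs as sum(zs) / sqrt(number of zs)."""
    groups = {}
    for name, z in rows:
        groups.setdefault(name, []).append(z)
    return {name: sum(zs) / sqrt(len(zs)) for name, zs in groups.items()}


if __name__ == "__main__":
    pooled = pool_by_investigator(TABLE_III_ROWS)
    for name, z in pooled.items():
        print(name, round(z, 2), TABLE_IV_DIRECTIONAL[name])
```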

effects of the experimenter's expectancy on subjects' performance of (a)
the Wechsler Adult Intelligence Scale (Getter et al.), (b) the Block
Design subtest of the same Scale (Wartenberg-Ekren), (c) a
color-recognition task (Timaeus and Lück), and (d) a dot-tapping task
(Wessler). Two of the experiments (Kennedy et al.) employed a verbal
conditioning task and in the second of these studies those experimenters
expecting greater "conditioning" obtained greater conditioning than did
the experimenters expecting less conditioning. In this experiment, as
indicated by the asterisk of Table V, as in many others, there was
also an interaction effect (defined as z > +1.28 or z < -1.28) between
experimenter expectancy and some other variable. In this study the
interaction took the form that those experimenters of a more "humanistic"
or optimistic disposition obtained greater biasing effects of
expectations than did experimenters of a more "deterministic" or
pessimistic disposition.

The earlier experiment by Kennedy et al. also varied expectations
about subjects' conditioning scores but half the time the experimenters
had visual contact with their subjects (i.e., sat across from them) and
half the time they had no visual contact with subjects (i.e., sat behind
them). Because the same experimenters were employed in both conditions
we count this as only one experiment and the overall test of
significance showed the directional z < +1.28 and no interaction at
z = |1.28| between expectancy and visual contact. It was nevertheless of
interest to note that in the face-to-face condition, experimenters
expecting greater conditioning obtained greater conditioning (z = +1.95)
than did the experimenters expecting less conditioning. In the condition
of no visual contact the analogous z was very close to zero. These
results, though summarized as a "failure to replicate," do suggest that
visual cues may be important to the communication of experimenter
expectations to the subjects of a verbal conditioning experiment as they
were important to the communication of expectations to Clever Hans. The
second experiment in the Kennedy series was conducted with experimenter
and subject in face-to-face contact.

Especially instructive for its unusual within-subject experimental
manipulation was the study by Larrabee and Kleinsasser. They employed
five experimenters to administer the Wechsler Intelligence Scale for
Children (WISC) to 12 sixth graders of average intelligence. Each subject
was tested by two different experimenters, one administering the
even-numbered items and the other administering the odd-numbered items.
For each subject, one of the experimenters was told that the child was
of above average intelligence while the other experimenter was told
that the child was of below average intelligence. When the child's
experimenter expected superior performance the total IQ earned was 7.5
points higher on the average than when the child's experimenter expected
inferior performance. When only the performance subtests of the WISC
were considered, the advantage to the children of having been expected
to do well was less than three IQ points and could easily have occurred
by chance. When only the verbal subtests of the WISC were considered,
the advantage of having been expected to do well, however, exceeded
10 IQ points. The particular subtest most affected by experimenters'
expectancies was Information. The results of this study are especially
striking in view of the very small sample size (12) of subjects employed.

In the experiment by Hurwitz and Jenkins the tasks were not
standardized tests of intelligence, but rather two standard laboratory
tests of learning. Three male experimenters administered a rote verbal
learning task and a mathematical reasoning task to a total of 20 female
subjects. From half their subjects the experimenters were led to expect
superior performance; from half they were led to expect inferior
performance.

In the rote learning task, subjects were shown a list of pairs of
nonsense syllables and were asked to remember one of the pair members
from a presentation of the other pair member. Subjects were given six
trials to learn the syllable pairs. Somewhat greater learning occurred
on the part of the subjects contacted by the experimenters believing
subjects to be brighter, although the difference was not large
numerically and z < +1.28; subjects alleged to be brighter learned 11 per
cent more syllables. The curves of learning of the paired nonsense
syllables, however, did show a difference between subjects alleged to be
brighter and those alleged to be duller. Among "brighter" subjects,
learning increased more monotonically over the course of the six trials
than was the case for "duller" subjects. (The coefficient of
determination between accurate recall and trial number was .50 for the
"bright" subjects and .25 for the "dull" subjects, z > 2.00 but not used
in assessing overall significance.)

In the mathematical reasoning task, subjects had to learn to use three
sizes of water jars in order to obtain exactly some specified amount
of water. On the critical trials the correct solution could be obtained
by a longer and more routine procedure which was scored for partial
credit or by a shorter but more novel procedure which was given full
credit. Those subjects whose experimenters expected superior
performance earned higher scores than did those subjects whose
experimenters expected inferior performance. Among the latter subjects,
only 40 per cent ever achieved a novel solution, while among the
allegedly superior subjects 88 per cent achieved one or more novel
solutions.
Subjects expected to be dull made 57 per cent again as many errors
as did subjects expected to be bright. In this experiment, with two
tasks performed by each subject, the overall z was based on subjects'
performance on both tasks.

The final experiment to be mentioned in this section is of special
importance because of the elimination of plausible alternatives to the
hypothesis that it is the subject's response that is affected by the
experimenter's expectancy. In his experiment, Johnson employed the
Stevenson marble-dropping task. Each of the 20 experimenters was led to
believe that marble-dropping rate was related to intelligence. More
intelligent subjects were alleged to show a greater increase in rate of
marble dropping over the course of six trials. Each experimenter then
contacted eight subjects, half of whom were alleged to be brighter than
the remaining subjects.

The recording of the subject's response was by means of an electric
counter, and the counter was read by the investigator who was blind
to the subject's expectancy condition. As can be seen from Table V,
the results of this study, one of the best controlled in this area, were
the most dramatic. Experimenters expecting a greater increase in
marble-dropping rate obtained a greater increase than they did when
expecting a lesser increase. In this study, too, there was an interaction
effect between the expectation of the experimenter, the sex of the
experimenter, and the sex of the subject. Same-sex dyads showed a greater
effect of experimenter expectation (z = 1.80).

Considering the studies of human learning and ability as a set, it
appears that the effects of experimenter expectancy may well operate
as unintended determinants of subjects' performance. The magnitudes
of the effects obtained, however, are considerably smaller than those
obtained with animal subjects. It would take only another 21 experiments
with an average directional z of .00 to bring the overall p to the .05
level compared to the 239 experiments required in the area of animal
learning. Another 18 principal investigators averaging zero z results
would bring the combined p to .05 in the area of human learning and
abilities.
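
The figures in this paragraph can be checked directly by asking what the combined z would become once the stated number of null results is added. The sketch below is illustrative only: `diluted_z` is a hypothetical helper, the sums of directional zs are taken from the summary rows of Tables III and V, and the comparison value of 1.645 assumes a one-tailed .05 level.

```python
from math import sqrt


def diluted_z(sum_z, k, extra_null_studies):
    """Combined z after adding studies that each contribute z = .00."""
    return sum_z / sqrt(k + extra_null_studies)


# (sum of directional zs, number of results, added null results)
CASES = {
    "animal learning, per experiment": (25.91, 9, 239),
    "human learning, per experiment": (9.04, 9, 21),
    "human learning, per investigator": (8.38, 8, 18),
}

if __name__ == "__main__":
    for label, (sum_z, k, extra) in CASES.items():
        # Each value should land close to 1.645, the one-tailed .05 point.
        print(label, round(diluted_z(sum_z, k, extra), 2))
```

In each case the diluted z falls within about .01 of 1.645, so the counts quoted in the text are consistent with the table sums.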

Because there are few experiments in this set employing exactly the
same task it is difficult to be sure of any pattern in the magnitudes
of zs obtained in individual studies or by individual investigators.
Perhaps it does appear, however, that standardized intelligence tests
employed with adults are relatively not so susceptible to the effects of
the experimenter's expectancy. The Hurwitz and Jenkins results, however,
weaken that conclusion somewhat. With only one study each for
color recognition and dot tapping perhaps any conclusion would be
premature.

B. Psychophysical Judgments

Table VI shows the results of nine studies employing tasks we may 
refer to loosely as requiring psychophysical judgments, and Table VII 
shows the per investigator summary. Five of the six studies yielding
directional zs < +1.28 employed a number estimation task (Adair;
Muller & Timaeus; Shames and Adair, I, II; Weiss).

Adair, though he found no main effect of experimenter expectancy, 
did find that the magnitude of expectancy effect could be predicted 
from a knowledge of the sex of experimenter and sex of subject. Greater 
expectancy effects were found when experimenter and subject were of 
the opposite rather than the same sex (z = 2.33). Muller and Timaeus
found that the effect of experimenter expectation was to decrease the 
variability of obtained responses relative to a control group, while Weiss 
found that relative to the control subjects, subjects whose experimenters 
had been given any expectation underestimated the number of dots 
presented. Shames and Adair (I) found that those experimenters who 
were judged by their subjects as more courteous, more pleasant, and 
more given to the use of head gestures showed a tendency (all 
zs > 1.96) to obtain data opposite to that which they had been led 
to expect. 

The experiments by Horst and by Wessler both employed a line length 
TABLE VI

Expectancy Effects in Studies of Psychophysical Judgments

Study                                           Standard normal deviate
Code number   Authors                           Nondirectional   Directional

1.            Adair, 1968 I                            .00             .00a
2.            Horst, 1966 I                           1.74           +1.94
3.            Muller and Timaeus, 1967 I              1.88             .00a
4.            Shames and Adair, 1967 I                 .00             .00a
5.            Shames and Adair, 1967 II                .00             .00
6.            Weiss, 1967 I                           1.39             .00a
7.            Wessler, 1968b I                         .00             .00
8.            Zoble, 1968 I                           3.29           +3.70
9.            Zoble, 1968 II                          2.02           +2.02

              Sum                                    10.32           +7.66
              √9                                      3               3
              z                                       3.44           +2.55
              p ≤                                      .0003           .006

a Indicates that experimenter expectancy interacted with another variable at
z > |1.28|.
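
The summary rows of Table VI follow from the individual entries by simple
arithmetic: the zs are summed, and the sum is divided by the square root of
the number of studies. A minimal Python sketch of that arithmetic is given
here for readers who wish to check the table. The function name combined_z
is illustrative only and does not come from the original text.

```python
import math

def combined_z(zs):
    """Sum the standard normal deviates and divide by sqrt(number of results)."""
    return sum(zs) / math.sqrt(len(zs))

# Entries of Table VI, studies 1 through 9.
nondirectional = [0.00, 1.74, 1.88, 0.00, 0.00, 1.39, 0.00, 3.29, 2.02]
directional = [0.00, 1.94, 0.00, 0.00, 0.00, 0.00, 0.00, 3.70, 2.02]

for label, zs in (("nondirectional", nondirectional), ("directional", directional)):
    # Print the column sum and the combined z for each column.
    print(label, round(sum(zs), 2), round(combined_z(zs), 2))
```

Run as written, the sketch gives sums of 10.32 and 7.66 and combined zs of
3.44 and 2.55, the values shown in the Sum and z rows of the table.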
additional investigators with mean .00 findings would bring the
cumulative p to .05.
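
The count of additional null results quoted in these summaries can be
checked with a short calculation. If k results have a directional z sum
of S, then adding X further results with a mean z of .00 leaves the sum
unchanged while the divisor grows to the square root of k + X. Setting
S divided by that root equal to 1.645, the one-tailed .05 point of the
normal curve, and solving for X gives the sketch below. The function name
and the choice of 1.645 are assumptions made for illustration; the text
does not state the exact criterion that was used.

```python
def additional_null_results(z_sum, k, z_crit=1.645):
    """Solve z_sum / sqrt(k + X) = z_crit for X, the number of extra
    results with a mean z of zero (z_crit = 1.645 is an assumed criterion)."""
    return (z_sum / z_crit) ** 2 - k

# Directional sums and counts for psychophysical judgments.
by_study = additional_null_results(7.66, 9)         # nine studies
by_investigator = additional_null_results(6.00, 6)  # six principal investigators

print(round(by_study, 1), round(by_investigator, 1))
```

Under these assumptions the two values come out near 12.7 and 7.3, in
reasonable agreement with the dozen experiments and the seven
investigators mentioned in the text.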

C. Reaction Time 

Table VIII shows the results of three studies by three different investi- 
gators in which the dependent variable was one form or another of 
reaction time. Employing visual stimuli, McFall found no effects of ex- 
perimenter expectancy, but Wessler did. Wessler also found that the 
effects of experimenter expectancy were greater on earlier trials and 
that, over time, there was a monotonic decrease of expectancy effect 
(z = 1.65). The remaining experiment, by Silverman, employed verbal
rather than visual stimuli. 


TABLE VIII

Expectancy Effects in Studies of Reaction Time

Study                                           Standard normal deviate
Code number   Authors                           Nondirectional   Directional

1.            McFall, 1965 I                           .00             .00
2.            Silverman, 1968 I                       1.88           +1.88a
3.            Wessler, 1966, 1968a I                  1.46           +1.46a

              Sum                                     3.34           +3.34
              √3                                      1.73            1.73
              z                                       1.93           +1.93
              p <                                      .03             .03

a Indicates that experimenter expectancy interacted with another variable at
z > |1.28|.
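
The p values in the last row of Table VIII can be approximated from the
combined z by taking the area of the standard normal curve beyond z. The
sketch below assumes a one-tailed area, which is an inference from the
tabled values rather than something stated in the text, and the function
name is illustrative.

```python
import math

def upper_tail_p(z):
    """Area of the standard normal distribution above z (one-tailed p)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Combined z for the three reaction time studies.
p = upper_tail_p(1.93)
print(round(p, 4))
```

The result is about .027, which is consistent with the tabled p < .03.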

Silverman employed 20 students of advanced psychology as experimenters
to administer a word association test to 333 students of introductory
psychology. Half the experimenters were led to expect that
some of their subjects would show longer latencies to certain words
than would their control group subjects. The remaining experimenters
were given no expectations and served as an additional control condition.
Results showed that latencies did not differ between the two baseline
conditions but that when experimenters expected longer latencies from
their subjects, they obtained longer latencies.

For some of the experimenters, Silverman found a significant tendency
to commit scoring errors in the direction of their expectations, but there
was some evidence to suggest that such scoring errors could not very
well account for the effects obtained. Silverman had found an interaction
TABLE VII

Expectancy Effects in Studies of Psychophysical Judgments
by Principal Investigators

Principal investigator                          Standard normal deviate
Code number   Name                              Nondirectional   Directional

I             Adair                                    .00             .00
II            Horst                                   1.74           +1.94
III           Timaeus                                 1.88             .00
IV            Weiss                                   1.39             .00
V             Wessler                                  .00             .00
VI            Zoble                                   3.77           +4.06

              Sum                                     8.78           +6.00
              √6                                      2.45            2.45
              z                                       3.58           +2.45
              p <                                      .0002           .007


estimation task. Data presented by Wessler suggest that the z associated
with the effect of experimenter expectancy might well be > +1.28, but
because it could not be determined exactly from the data available,
and because no effect appeared in two other tasks administered to the
same subjects, we count the z as .00. Horst, however, found line length
estimation to be affected by the experimenter's expectancy and more
so by those experimenters rated by their subjects as more pleasant,
bolder, and less awkward. In addition, Horst found (just as Weiss did)
that, relative to the control subjects, subjects whose experimenters had
been given any expectation showed a greater tendency to underestimate.

The largest effects of interpersonal expectancies were found in the
studies of tone length discrimination by Zoble. In both his studies, which
differed from each other in the mental sets induced in the subjects,
he found that the experimenter's expectancy was a significant determinant
of subjects' discriminations. In addition, while he found that either
the visual or the auditory channel was probably sufficient to serve as
mediator of expectancy effects, the data suggested that the visual channel
was more effective than the auditory channel (z = +1.44).

On the whole, the area of psychophysical judgment, particularly when
the judgment is of numerosity, seems less susceptible to the effects of
experimenter expectancy than the other areas considered so far. The
number of additional experiments with a mean directional z of .00
required to bring the overall p to the .05 level for the area of
psychophysical judgments is only a dozen. On a per investigator basis,
only seven
obtained more human than animal responses. The remaining experimenters
were given the opposite value, belief, or expectation. All experimenters
were forcefully warned not to coach their two subjects, and
all administrations of the Rorschach were tape-recorded. Results showed
that experimenters led to prize animal responses obtained one-third again
as high an animal to human response ratio as did the experimenters
led to prize human responses. Analysis of the tape recordings revealed
no evidence favoring the hypothesis that differential verbal reinforcement
of subjects' responses might have accounted for the differences
obtained. In addition, none of the subjects reported that their
experimenter seemed to show any special interest in any particular type of
response. The cues by which an experimenter unintentionally informs
his subject of the desired response appear likely to be subtle ones.

In the experiment by Marwit and Marcia, 36 advanced undergraduate
experimenters administered five of the Holtzman inkblots to a total of
53 students of elementary psychology. Some of the experimenters
expected many responses from their subjects either on the basis of their
own hypotheses or because that was what they had been led to expect.
The remaining experimenters expected few responses from their subjects.
The overall results showed that experimenters expecting more responses
obtained more responses than did experimenters expecting fewer
responses. Among the experimenters who had developed their own
hypotheses, those who expected more responses obtained 59 per cent more
responses than did the experimenters who expected fewer responses.
Among the experimenters who were given “ready-made” expectancies,
those who expected more obtained 61 per cent more responses than
those who expected fewer responses.

In this experiment almost one third of the experimenters admitted
to being aware that their own expectancy effects were under
investigation. Interestingly enough, this admitted awareness bore no
relationship to magnitude of expectancy effect exerted. In addition, there
was no overall relationship between the number of verbal inquiries made by
experimenters and the number of responses obtained from their subjects
(r = .07). However, an interesting reversal of what we might expect
occurred when it was shown that the subgroup of experimenters who
asked the most questions were the experimenters who had been led
to expect few responses (z = 2.40). Finally, there was an interesting
tendency, not found in earlier studies, for those inkblots shown later
to manifest greater expectancy effects than those inkblots shown earlier.
If some form of unintended reinforcement were employed by the
experimenters, it is at least unlikely to have been anything so obvious as
differential questioning.
of experimenter expectation by sex of experimenter and sex of subject
(z = 1.95). The nature of this interaction was the same as that found
by Adair. When experimenters and subjects were of the opposite sex
they showed greater expectancy effects than when they were of the
same sex. Silverman's plausible reasoning was that scoring errors on
the part of the experimenters ought not to be differentially related to
their subjects' sex as a function of their own sex.

Because of the small number of experiments conducted in this area
it would take only one additional study (or principal investigator) with
an associated z of zero to bring the combined p level to .05. Thus,
though two of the three studies obtained zs of +1.28 or greater, we
cannot have the confidence in the expectancy hypothesis that seems
warranted for other research.

D. Inkblot Tests 

Table IX summarizes by study and by principal investigator the results
of research on expectancy effects employing as the dependent variable
subjects' perceptions of inkblot test materials. In the first of these
studies, Masling employed 14 graduate student experimenters to administer
the Rorschach to a total of 28 subjects. Half the experimenters were led
to believe that it would reflect more favorably upon themselves if they

TABLE IX

Expectancy Effects in Studies of Inkblot Tests

Study                                           Standard normal deviate
Code number   Authors                           Nondirectional   Directional

1             Marwit, 1968 I                          1.80           +1.80
2             Marwit and Marcia, 1967 I               3.25           +3.25
3             Masling, 1965, 1966 I                   2.05           +2.05
4             Strauss, 1968 I                         2.32             .00a

Summary

By study
              Sum                                     9.42           +7.10
              √4                                      2               2
              z                                       4.71           +3.55
              p ≤                                      .0000015        .0002
By principal investigator
              Sum                                     7.95           +5.63
              √3                                      1.73            1.73
              z                                       4.60           +3.25
              p =                                      .0000025        .0006

a Indicates that experimenter expectancy interacted with another variable at
z > |1.28|.
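
The rows of Table IX headed by principal investigator pool each
investigator's studies before combining across investigators. The sketch
below assumes that an investigator's z is the sum of that investigator's
study zs divided by the square root of their number, and that Marwit is
the principal investigator for both studies 1 and 2. Neither assumption is
stated explicitly in the text, and the names used in the code are
illustrative.

```python
import math
from collections import defaultdict

# Nondirectional zs from Table IX, keyed by the assumed principal investigator.
studies = [("Marwit", 1.80), ("Marwit", 3.25), ("Masling", 2.05), ("Strauss", 2.32)]

by_investigator = defaultdict(list)
for name, z in studies:
    by_investigator[name].append(z)

# Pool within each investigator, then combine across investigators.
pooled = {name: sum(zs) / math.sqrt(len(zs)) for name, zs in by_investigator.items()}
z_sum = sum(pooled.values())
z_combined = z_sum / math.sqrt(len(pooled))

print(round(z_sum, 2), round(z_combined, 2))
```

These values, about 7.94 and 4.58, are close to the tabled 7.95 and 4.60.
The small differences may reflect the use of rounded square roots in the
original computation.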
In his experiment, Strauss employed five female experimenters, each
of whom was to administer the Rorschach to six female undergraduates.
For two of the subjects each experimenter was led to expect an
“introversive” experience balance, for another two of the subjects each
experimenter was led to expect an “extratensive” experience balance, and for
the remaining two subjects, experimenters were given no expectations.
Subjects' actual experience balance, measured by the difference between
relative standings in human movement (M) and color (Sum C) production,
was found overall to be unrelated to the experimenter's expectancy.
There was an interesting difference, however, in the variability of
obtained responses across the three treatment groups (F max = 19.22,
p < .05), with the control group of subjects for whom experimenters
had been given no expectation showing least individual differences
among experimenters. Relative to the control group subjects, subjects
contacted by the different experimenters in both conditions of expectancy
obtained experience balance scores that were both too high and too
low.

As a check on the success of the induction of the expectations, Strauss
asked his experimenters to predict the experience balance that would
be obtained from each subject. The analysis showed very clearly that,
on the average, experimenters predicted experience balance scores very
much in line with those they had been led to expect. In addition,
however, an interaction effect was obtained (z > 2.58) which showed that
one of the five experimenters predicted results opposite to those he
had been led to expect, while another predicted an unusually great
difference between his introversive and extratensive subjects, but in the
right direction. These two extreme predictors, it turned out, both
showed a tendency to obtain responses opposite to those they had been
led to expect (mean expectancy effect in standard score units = -1.74).
The remaining three experimenters all obtained more positive effects
(mean expectancy effect in standard score units = +1.15). With so
small a sample of experimenters (df = 3) such a comparison (z = 1.44)
can be at best suggestive, but it may serve to alert other investigators
to the interesting possibility that experimenters, when given a prophecy
for a subject's behavior, may be more likely to fulfill that prophecy
if they believe neither too much nor too little that the prophecy will
be fulfilled.

In the most recent of the inkblot experiments, Marwit employed 20
graduate students in clinical psychology as his experimenters and 40
undergraduate students of introductory psychology as his subjects. Half
the experimenters were led to expect some of their subjects to give
many Rorschach responses and especially a lot of animal responses.
Half the experimenters were led to expect some of their subjects to
give few Rorschach responses but proportionately a lot of human
responses. Results showed that subjects who were expected to give more
responses gave more responses (z = +1.55) and that subjects who were
expected to give a greater number of animal relative to human responses
did so (z = +2.04). Marwit also found trends for the first few responses
to have been already affected by the experimenter's expectancy and for
later-contacted subjects to show greater effects of experimenter
expectancy than earlier-contacted subjects.

In summarizing the four inkblot experiments we can say that three
investigations obtained substantial effects of experimenters' expectancies
while one investigation did not. Perhaps we can account for the
differences in results in terms of differences in procedure. In the three
experiments showing expectancy effects, the expectancy induced was for a
more simple Rorschach response: animal or human content in one study,
number of responses in another, and both in the third. In the study
showing no significant expectancy effect, the expectancy induced was
for a more complex response, one involving a relationship of two response
categories to one another, human movement and color.

In the three experiments showing expectancy effects, each experimenter
entertained the same hypothesis for each of his experimental
group subjects. In the experiment not showing the expectancy effect,
each experimenter contacted subjects under opposite conditions of
expectation. There have been many experiments with human and animal
subjects showing that expectancy effects may occur even when the different
expectations are held in the mind of the same experimenters. There
are, however, a number of studies showing that under these conditions,
from 12 to 20 per cent of the experimenters show significant reversals
of expectancy effect. The word significant is italicized to emphasize
that we speak not of failures to obtain data in the predicted direction,
but of obtaining data opposite to that expected with non-Gaussian gusto
(Rosenthal, 1967c, 520). In the study by Strauss not showing expectancy
effects on the obtained Rorschach experience balance, such extreme
reversals were not obtained, but the sample of experimenters was small
(five).

Finally, in the study not showing expectancy effects, the experimenters
were more experienced than those of the studies that did show
expectancy effects. There is, however, some other evidence to suggest that
more experienced, more competent, and more professional experimenters
may be the ones to show greater rather than smaller expectancy effects
(Rosenthal, 1966).

Pending the results of additional research, perhaps all we can now
say is that some inkblot responses may, under some conditions, be fairly
susceptible to the effects of the experimenter's expectancy. For the set
of experiments described here, the overall p level can be brought to
the .05 level by the addition of 15 new results of an average directional
z value of .00. For the set of principal investigators, the overall p level
can be brought to the .05 level by the additional results of nine principal
investigators obtaining an average directional z value of .00.

Later we shall have occasion to discuss more systematically the results
of studies of experimenter expectancy as a function of the laboratory
in which they were conducted. For now it should only be mentioned
that the one study of inkblot responses showing no overall directional
effect of experimenter expectancy was the one conducted in the writer's
laboratory.

E. Structured Laboratory Interviews 

Table X shows the per investigation and per investigator results of
the research in what must be the most miscellaneous of our research
areas. In one of the earliest of these studies, Pflugrath investigated the
effects of the experimenter's expectancy on scores earned on a
standardized paper-and-pencil test of anxiety. He employed nine graduate
student counselors, each of whom was to administer the Taylor Manifest
Anxiety Scale to two groups of students of introductory psychology.
In each group there was an average of about eight subjects. Three


TABLE X

Expectancy Effects in Studies of Structured Laboratory Interviews

Study                                           Standard normal deviate
Code number   Authors                           Nondirectional   Directional

1             Cooper, E, R, D, 1967 I
2             Jenkins, 1966 I                                        +1.34a
3             Pflugrath, 1962 I                                        .00
4             Raffetto, 1967, 1968 I                                 +5.24a
5             Rosenthal, P, V, F, 1963b I                            +1.48
6             Timaeus and Lück, 1968a I               1.55           +1.55

              Sum                                    14.73          +12.98
              √6                                      2.45            2.45
              z                                       6.01           +5.30
              p                                    ... million     ... million

a Indicates that experimenter expectancy interacted with another variable at
z > |1.28|.
of the experimenters were led to believe that their groups of subjects
were very anxious, three were led to believe that their groups of subjects
were not at all anxious, and three were given no expectations about
their subjects' anxiety level.

Pflugrath found no difference among the three treatment conditions
in his analysis of variance, but his χ² was large. The bulk of the obtained
χ² was due to the fact that, among experimenters led to expect high
anxiety scores, more subjects actually scored lower in anxiety. This
finding, while certainly not predicted, was at least interpretable in the
light of the experimenters' status as counselors-in-training. Told that they
would be testing very anxious subjects who had required help at the
counseling center, these experimenters may well have brought their
developing therapeutic skills to bear upon the challenge of reducing these
subjects' anxiety. If the subject's performance on even well-standardized
paper-and-pencil tests, administered in a group situation, may be affected
by the experimenter's perception of the subject, then it is not
unreasonable to suppose that such effects may occur with some frequency in
the more intense and more personal relationship that characterizes the
more typical clinical assessment situation.

As a check on the success of his experimental induction of
expectancies, Pflugrath asked his experimenters to predict the level of
anxiety they would actually find in each of their groups of subjects.
Although there was a tendency for examiners to predict the anxiety level
that they had been led to expect, this tendency did not reach an associated
z of +1.28. The experimentally manipulated expectations, then, were
not very effectively induced. It is of interest to note, however, that
all three of the experimenters who specifically predicted higher anxiety
obtained higher anxiety scores than did any of the experimenters who
specifically predicted lower anxiety (z > +1.65). Because the results
of the Pflugrath experiment showed some effects in the predicted
direction and some effects in the opposite direction, the directional z is
entered as .00 in Table X. The nondirectional z, however, retains the
information that some differences were associated with the effects of
experimenter expectancy.

The experiment by Raffetto was addressed to the question of whether
the experimenter's expectation for greater reports of hallucinatory
behavior might be a significant determinant of such reports.

Raffetto employed 96 paid, female volunteer students from a variety
of less advanced undergraduate courses to participate in an experiment
on sensory restriction. Subjects were asked to spend one hour in a small
room that was relatively free from light and sound. Eight more advanced
students of psychology served as the experimenters, with each one
interviewing 12 of the subjects before and after the sensory restriction
experience. The preexperimental interview consisted of factual questions
such as age, college major, and college grades. The postexperimental
interview was relatively well structured, including questions to be answered
by “yes” or “no” as well as more open-ended questions, e.g., “Did you
notice any particular sensations or feelings?” Postexperimental interviews
were tape-recorded.

Half the experimenters were led to expect high reports of hallucinatory
experiences, and half were led to expect low reports of hallucinatory
experiences. Obtained scores of hallucinatory experiences ranged from
zero to 32 with a grand mean of 5.4. Of the subjects contacted by
experimenters expecting more hallucinatory experiences, 48 per cent
were scored above the mean on these experiences. Of the subjects
contacted by experimenters expecting fewer hallucinatory experiences, only
6 per cent were scored above the mean.

Since in this experiment the experimenters scored their own interviews
for degree of hallucinatory experience, it is possible that scoring errors
accounted for part of the massive effects obtained. It seems unlikely,
however, in the light of what we now know of such errors, that effects
as dramatic as these could have been due entirely to scoring errors
even if such errors were very great. Fortunately this question can be
answered in the future since Raffetto did tape record the interviews
conducted so that they can be rated by “blind” observers. When Raffetto
himself checked the experimenters' scoring he found no significant
scoring errors, but we must note, as did Raffetto, that he was not blind
to the interviewers' condition of expectancy. The work of Beez (1968),
however, amply documents the fact that such dramatic effects of
expectancy may occur even in the absence of scoring errors.

In the experiment conducted by Rosenthal et al., 18 graduate students
served as experimenters in a study of verbal conditioning conducted
with 65 undergraduate subjects. Half the experimenters were led to
expect from their subjects high rates of awareness of having been
conditioned, while the remaining experimenters were led to expect low rates
of awareness of having been conditioned. Questionnaires assessing
subjects' degree of awareness were scored blindly by two psychologists.
Of the subjects expected to show a low degree of awareness, 43 per cent
were subsequently judged as “aware.” Of the subjects expected to show
a high degree of awareness, 68 per cent were subsequently judged as
“aware.”

In the experiment by Timaeus and Lück of the University of Cologne,
subjects were asked to estimate the level of aggression to be found
in a Milgram-type experiment. When experimenters had been led to
expect high levels of aggression, they obtained higher levels of aggression
than when experimenters had been led to expect lower levels of
aggression. Jenkins found in her experiment that factual information about
a stimulus person could be communicated unintentionally from
experimenter to subject. Subjects contacted by experimenters believing one
set of factual statements to be true of a stimulus person more often
also believed those factual statements to be true of the stimulus person
than did subjects contacted by experimenters believing the opposite
factual statements to be true of the stimulus person. The experiment
by Cooper et al. is one we shall consider later in more detail. Their
research showed that the degree of certainty of having to take a test
as a function of the degree of preparatory effort was successfully
predicted from a knowledge of the experimenters' expectations.

Considering the results of these experiments (or principal
investigators) as a set, it would require 56 additional studies (or
investigators) finding a mean directional z of .00 to bring the overall
p level to .05.


F. Person Perception 

Table XI shows the results of 57 studies of expectancy effect in which
a standardized task of person perception was employed. Table XII shows
the analogous results based not on studies but on principal investigators.
The basic paradigm of these investigations has been sufficiently uniform
that we need only an illustration (Rosenthal and Fode, 1963b I).

Ten advanced undergraduate and graduate students of psychology
served as the experimenters. All were enrolled in an advanced course
in experimental psychology and were already involved in conducting
research. Each student experimenter was assigned as his subjects a group
of about 20 students of introductory psychology. The experimental
procedure was for the experimenter to show a series of ten photographs
of people's faces to each of his subjects individually. The subject was
to rate the degree of success or failure shown in the face of each person
pictured in the photos. Each face could be rated as any value from
-10 to +10, with -10 meaning extreme failure and +10 meaning
extreme success. The 10 photos had been selected so that, on the average,
they were rated as neither successful nor unsuccessful, but rather as
neutral, with an average numerical score of zero.

All ten experimenters were given identical instructions on how to
administer the task to their subjects and were given identical instructions
to read to their subjects. They were cautioned not to deviate from these
instructions. The purpose of their participation, it was explained to all
experimenters, was to see how well they could duplicate experimental
results which were already well-established. Half the experimenters were
told that the “well-established” finding was such that their subjects
should rate the photos as being of successful people (ratings of +5)
and half the experimenters were told that their subjects should rate
the photos as being of unsuccessful people (ratings of −5). Results
showed that experimenters expecting higher photo ratings obtained
higher photo ratings than did experimenters expecting lower photo
ratings. Although all of the other experiments shown in Table XI were also
intended as replications of the basic finding, most of the work
summarized was designed particularly to learn something of the conditions
which increase, decrease, or otherwise modify the effects of experimenter
expectancy. That intent has characterized the work of 18 of the 20
principal investigators listed in Table XII. It was the role of auditory cues,
for example, that engaged the interest of Adair and Epstein.

TABLE XI

Expectancy Effects in Studies of Person Perception

               Study                            Standard normal deviate
Code
number  Authors                                 Nondirectional   Directional

  1     Adair and Epstein, 1967 I                    1.65          +1.65
  2     Adair and Epstein, 1967 II                   1.64          +1.64
  3     Adler, 1968 I                                4.42          +4.42
  4     Adler, 1968 II                               2.33          −2.33
  5     Adler, 1968 III                              1.50          −1.50
  6     Barber, C, F, M, C, B, 1967 I                 .00            .00 a
  7     Barber, C, F, M, C, B, 1967 II                .00            .00
  8     Barber, C, F, M, C, B, 1967 III               .00            .00
  9     Barber, C, F, M, C, B, 1967 IV                .00            .00
 10     Barber, C, F, M, C, B, 1967 V                1.58            .00
 11     Bootzin, 1968 I                              2.14          +2.14 b
 12     Bootzin, 1968 II                             1.64          −1.64 b
 13     Bootzin, 1968 III                            1.44          +1.44 b
 14     Carlson and Hergenhahn, 1968 I                .00            .00 b
 15     Carlson and Hergenhahn, 1968 II               .00            .00 b
 16     Connors, 196[?] I                             .00            .00 b
 17     Connors and Horst, 1966 I                     .00            .00 b
 18     Fode, 1967 I                                 2.81          +2.81 b
 19     Horn, 1968 I                                 2.01          +2.01
 20     Jenkins, 1966 I                              1.61          +1.61 b
 21     Laszlo and Rosenthal, 1967 I                 1.80          +1.80 b
 22     Marcia, 1961 I                                .00            .00 b
 23     Marcia, 1961 II                               .00            .00 b
 24     McFall, 1965 I                                .00            .00 b
 25     Moffat, 1966 I                                .00            .00
 26     Nichols, 1967 I                               .00            .00 b
 27     Persinger, 1962 I                             .00            .00 b
 28     Persinger, K, R, 1966 I                      1.88          +1.88 b
 29     Persinger, K, R, 1968 I                      1.64          +1.64 b
 30     Rosenthal and Fode, 1963b I                  2.46          +2.46
 31     Rosenthal and Fode, 1963b II                 3.94          +3.44
 32     Rosenthal and Fode, 1963b III                1.64          +1.64 b
 33     Rosenthal, F, J, F, S, W, V, 1964 I          1.52          −1.52 b
 34     Rosenthal, K, G, C, 1965 I                   1.69          +1.69 b
 35     Rosenthal, K, G, C, 1965 II                   .00            .00 b
 36     Rosenthal and Persinger, 1968 I              1.29          +1.29 b
 37     Rosenthal and Persinger, 1968 II              .00            .00 b
 38     Rosenthal, P, M, V, G, 1964a I               1.44          +1.44 b
 39     Rosenthal, P, M, V, G, 1964a II               .00            .00 b
 40     Rosenthal, P, M, V, G, 1964b I               2.33            .00 b
 41     Rosenthal, P, M, V, G, 1964b II              2.58            .00
 42     Rosenthal, P, V, F, 1963a I                  2.17          +2.33 b
 43     Rosenthal, P, V, 1963 I                       .00            .00
 44     Rosenthal, P, V, M, 1963 II                  1.96          +1.96 b
 45     Shames and Adair, 1967 I                     1.70          +1.70
 46     Smiltens, 1966 I                             1.28          −1.28 b
 47     Trattner, 1966, 1968 I                        .00            .00 b
 48     Uno, F, R, 1968 I                            1.99          −1.99 b
 49     Uno, F, R, 1968 II                            .00            .00 b
 50     Uno, F, R, 1968 III                           .00            .00
 51     Uno, F, R, 1968 IV                           2.17          −2.17 b
 52     Uno, F, R, 1968 V                             .00            .00
 53     Weick, 1966 I                                2.33          +2.33 b
 54     Wessler, 1968b I                              .00            .00
 55     Wessler and Strauss, 1968 I                  1.65            .00
 56     White, 1962 I                                2.81          −1.51
 57     Woolsey and Rosenthal, 1966 I                1.34          +1.34 b

        Sum                                         68.38         +30.72
        √57                                          7.55           7.55
        z                                            9.06          +4.07
        p <                        1/(million)^[illegible]       .000025

a See also Rosenthal, 1967d, 1968b.
b Indicates that experimenter expectancy interacted with another variable at
  z > |1.28|.

TABLE XII

Expectancy Effects in Studies of Person Perception
by Principal Investigators

    Principal investigator          Standard normal deviate
Code
number   Name                    Nondirectional   Directional

I        Adair                        2.88          +2.88
II       Adler                        4.77            .00
III      Barber                       1.40            .00
IV       Bootzin                      3.02            .00
V        Carlson                       .00            .00
VI       Connors                       .00            .00
VII      Fode                         2.81          +2.81
VIII     Horn                         2.01          +2.01
IX       Jenkins                      1.61          +1.61
X        Marcia                        .00            .00
XI       McFall                        .00            .00
XII      Moffat                        .00            .00
XIII     Nichols                       .00            .00
XIV      Persinger                     .00            .00
XV       Rosenthal                    6.91          +3.52
XVI      Smiltens                     1.28          −1.28
XVII     Trattner                      .00            .00
XVIII    Weick                        2.33          +2.33
XIX      Wessler                       .00            .00
XX       White                        2.81          −1.51

         Sum                         31.83         +12.37
         √20                          4.47           4.47
         z                            7.12          +2.77
         p <                      [illegible]    [illegible]


Adair and Epstein first conducted a study which was essentially a
replication of the basic experiment on the self-fulfilling effects of
experimenters’ hypotheses. Results showed that, just as in the original
studies, experimenters who expected the perception of success from their
subjects fulfilled their expectations, as did the experimenters who had
prophesied the perception of failure by their subjects.

During the conduct of this replication experiment, Adair and Epstein
tape-recorded the experimenters’ instructions to their subjects. The
second experiment was then conducted not by “live” experimenters, but
by tape-recordings of experimenters’ voices reading standard instructions
to their subjects. When the tape-recorded instructions had originally
been read by experimenters expecting success perception by their
subjects, the tape-recordings evoked greater success perceptions from their
subjects. When the tape-recorded instructions had originally been read
by experimenters expecting failure perception by their subjects, the
tape-recordings evoked greater failure perceptions from their subjects.
Self-fulfilling prophecies, it seems, can come about as a result of a
prophet's voice alone. Since, in the experiment described, all experimenters
read standard instructions, self-fulfillment of prophecies may be brought
about by the tone in which the prophet prophesies.

Adler, in her recent research, investigated the effects on experimenter
expectancy of several experimenter sets or orientations toward outcomes.
When experimenters were made to feel that it was important to obtain
certain results, experimenters obtained the expected results. When
experimenters were made to feel that it was very important to follow
certain scientific procedures, they obtained results significantly opposite
to those that they had been led to expect. In the control condition,
in which no special orientations toward outcome were generated,
experimenters also showed the reversal tendency. For the particular
sample of experimenters and subjects employed, it seems possible that
a general process-consciousness was operating that contributed to the
reversal effect among the experimenters of the control group.

Many of the experiments listed in Table XI with an associated
directional z < +1.28 showed one or more interaction effects of experimenter
expectancy and some other variable. These interactions, and those found
between experimenter expectancy and other variables in the earlier
described research areas, will not be described here but will be drawn
upon in a later discussion of factors complicating the effects of
experimenter expectancy.

For many of the experiments listed with z < +1.28 there is no ready
explanation for the low z, but sometimes the design of the experiment
was intentionally such as to minimize the effects of experimenter
expectancy. Thus Carlson and Hergenhahn (II) interposed a screen between
experimenters and subjects and used a tape recorder to administer
instructions to subjects (I, II), both of these procedures having been
suggested as techniques for the reduction of expectancy effects
(Rosenthal, 1966). Similarly, Moffat’s experimenters were made to remain mute.

It has been suggested that higher status experimenters may show
greater expectancy effects (Rosenthal, 1966). In only nine of the studies
listed in Table XI was no attempt made to have the experimenters
exceed their undergraduate subjects in class standing, age, or training
in psychology (Bootzin I, II; Carlson and Hergenhahn I, II; Barber
et al. I, II, III, IV, V). Only one of these nine studies, or 11 per cent,
showed a directional z of +1.28 or greater, about what we might expect
by chance. Of the remaining 45 studies employing college samples
(Persinger, Knutson, and Rosenthal, 1966, 1968, and Trattner, 1966, 1968,
employed neuropsychiatric patients), 19, or 42 per cent, showed an
associated z of +1.28 or greater. Apparently, for college samples, when
the experimenter’s status exceeds that of his subject in the person
perception task, the chances are almost quadrupled that expectancy effects
will be obtained compared to the situation in which experimenter and
subject are of the same status. This latter situation, of course, is relatively
rare not only in our sample of studies but also in the real world of
laboratory experiments.

On the whole, the person perception task seems less susceptible to
the effects of experimenter expectancy than most of the other areas
investigated, though the large number of studies conducted makes the
overall combined p a fairly stable one. It would take the addition of
278 studies (or 36 principal investigators) with a mean directional z
of .00 to bring the overall p to the .05 level. Compared to all other
research areas combined, however, the person perception task shows
fewer directional z results of +1.28 or greater (χ² = 6.51, p = .01).
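
The number of additional null results needed to raise a combined p to .05 can be approximated by solving Σz/√(k + X) = 1.645 for X. The Python sketch below is one reading of that computation and not necessarily the procedure behind the figures in the text; the function name and the one-tailed criterion are assumptions. Applied to the Table XII totals it gives 36 investigators, as quoted above; applied to the Table XI totals it gives a figure near 290, a little above the 278 quoted, so the original calculation may have differed in detail.

```python
from math import floor
from statistics import NormalDist


def tolerance_for_null_results(sum_of_z: float, k: int, alpha: float = 0.05) -> int:
    """Whole number of added studies averaging z = 0 that keep the combined one-tailed p at or below alpha."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha)  # about 1.645 for alpha = .05
    # Solve sum_of_z / sqrt(k + x) = z_crit for x, never reporting fewer than zero.
    return max(0, floor((sum_of_z / z_crit) ** 2 - k))


# Directional sums of z as printed in Tables XII and XI.
print(tolerance_for_null_results(12.37, 20))  # 36 principal investigators
print(tolerance_for_null_results(30.72, 57))  # 291 studies
```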

V. AN OVERVIEW OF EXPECTANCY EFFECTS

Now that we have considered the results of studies of expectancy
effects for seven areas of research, it will be convenient to have a
summary. Table XIII presents such a summary based on experiments, and
Table XIV presents a summary based on principal investigators. For
each research area the combined zs, both nondirectional and directional,
are given, as well as the per cent of the studies (or investigators) that
reached the specified value of z. The next to last row of Tables XIII
and XIV gives the grand overall zs based on all studies and all
investigators. The experiment by Wessler (1968) was represented in each of
three research areas and that by Jenkins in each of two research areas.
For each of these studies the mean z was used as the entry in the
next to last row of Tables XIII and XIV in order to have each entry
based on independent samples. Based either on these overall zs of all
studies (or all investigators), or on the results of the binomial tests
shown in the last row of Tables XIII and XIV, the overall p associated
with expectancy effects is infinitesimally small. In both tables it can
be seen that though the combined nondirectional zs are larger than
the combined directional zs in the next to last row, the directional zs
are larger when based on the binomial test. The reason, of course,
is that we expect twice as many of the nondirectional zs to reach a
given magnitude, and the binomial test takes that fact into account.
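
The binomial test in the last row of Tables XIII and XIV can be approximated by comparing the number of zs reaching the criterion with the number expected by chance: 10 per cent for directional zs at +1.28 and 20 per cent for nondirectional zs at |1.28|. The sketch below uses the normal approximation to the binomial. The counts of 47 and 63 studies are assumptions recovered from the rounded percentages (50 and 67 per cent of 94), and the function name is illustrative, so the results only approximate the tabled values of 12.92 and 11.39.

```python
from math import sqrt


def binomial_z(hits: int, n: int, chance: float) -> float:
    """Normal approximation to the binomial: (observed - expected) / standard deviation."""
    expected = n * chance
    return (hits - expected) / sqrt(n * chance * (1.0 - chance))


# Assumed counts: 50% and 67% of the 94 studies, rounded to whole studies.
z_directional = binomial_z(47, 94, 0.10)     # directional zs of +1.28 or greater
z_nondirectional = binomial_z(63, 94, 0.20)  # nondirectional zs of |1.28| or greater

print(round(z_directional, 2), round(z_nondirectional, 2))
```

The directional value comes out larger even though fewer studies reach the criterion, because the chance rate it is compared with is half as large.
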

TABLE XIII

Expectancy Effects in Seven Research Areas

                                        Nondirectional z       Directional z
Research area               Studies     z       % > |1.28|     z        % > +1.28

Animal Learning                9       8.49       100%        +8.64       100%
Learning and Ability           9 a     3.01        44%        +3.01        44%
Psychophysical Judgments       9 a     3.44        56%        +2.5[?]      33%
Reaction Time                  3       1.93        67%        +1.93     [illegible]
Inkblot Tests                  4       4.71       100%        +3.55        75%
Laboratory Interviews          6 b     6.01       100%        +[?].30      83%
Person Perception             57 a,b   9.06        60%        +4.07        39%
All Studies                   94 c    14.3[?]      67%        +9.82        50%
Binomial test z (N = 94)                         11.39                   +12.92

a Indicates a single experiment represented in each of three areas.
b Indicates a different experiment represented in each of two areas.
c Three entries were nonindependent and the mean z across areas was used for the
  independent entry.


TABLE XIV

Expectancy Effects in Seven Research Areas by Principal Investigators

                                            Nondirectional z        Directional z
Research area             Investigators     z        % > |1.28|     z            % > +1.28

Animal Learning                 5          8.2[?]      100%        +8.40           100%
Learning and Ability            8 a        2.96         50%        +2.45        [illegible]
Psychophysical Judgments        6 a        3.[?]8       67%        [illegible]  [illegible]
Reaction Time                   3          1.93         67%        +1.93        [illegible]
Inkblot Tests                   3          4.60        100%        +[?].10      [illegible]
Laboratory Interviews           6 b        6.01        100%        [illegible]  [illegible]
Person Perception              20 a,b      7.12     [illegible]    [illegible]  [illegible]
All Investigators              48 c       13.2[?]       71%        +9.[?]       [illegible]
Binomial test z (N = 48)                   8.[?]1                  +[?].71

a Indicates a single investigator represented in each of three areas by the same
  subject sample.
b Indicates another investigator represented in two areas by the same subject
  sample.
c Three entries were nonindependent and the mean z across areas was used for the
  independent entry.

In comparing the likelihoods of expectancy effects in the various
research areas, the use of the zs may be somewhat misleading. A large
number of investigations with only a moderately large number of zs
reaching a specified level will make for a very large z, while a small
number of investigations with a relatively large number of zs reaching
a specified level will make for a smaller z. For this reason, the percentage
of zs reaching a specified level may be a better basis on which to
compare the likelihood of expectancy effects in the various research areas.
By chance, we expect 10 per cent of the directional zs to reach or
exceed +1.28, but half of all directional zs reach that value. Effects
of the experimenter's expectancy are found most often in studies of
animal learning, laboratory interviews, inkblot tests, and reaction time.
They are found least often in studies of psychophysical judgments and
person perception, and about half the time in studies of human learning
and ability.
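
The dependence of a combined z on the number of investigations follows from the way the zs are pooled in the tables: the sum of the individual zs is divided by the square root of their number. A minimal Python sketch of that arithmetic, using only the column totals printed in Table XI, is given here. The function names are illustrative and not from the original, and the one-tailed normal p value is an assumption about how the tabled p was obtained.

```python
from math import sqrt
from statistics import NormalDist


def combined_z(sum_of_z: float, n_studies: int) -> float:
    """Pool independent zs: the sum of the zs over the square root of their number."""
    return sum_of_z / sqrt(n_studies)


def one_tailed_p(z: float) -> float:
    """Upper-tail probability of a standard normal deviate (one-tailed, by assumption)."""
    return 1.0 - NormalDist().cdf(z)


# Column totals as printed in Table XI (57 person perception studies).
z_nondirectional = combined_z(68.38, 57)
z_directional = combined_z(30.72, 57)

print(round(z_nondirectional, 2))               # 9.06
print(round(z_directional, 2))                  # 4.07
print(one_tailed_p(z_directional) < 0.000025)   # True
```

With 57 studies, a directional sum of 30.72 yields a combined z above 4 even though only 39 per cent of the individual zs reach +1.28, which is the situation the paragraph above cautions about.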

There is one sense in which some of the entries of Table XIV are
not independent. Some of the principal investigators conducted
experiments in more than one area of research. In addition, we have so far
considered as principal investigators anyone reporting an experiment
regardless of the laboratory of origin. For these reasons it was felt
to be instructive to summarize the results of all experiments conducted
in different laboratories with each laboratory given equal weight with
every other. Table XV lists 29 laboratories and the principal investigator
associated with each. Again the overall probabilities are very low, and
the median laboratory had about two thirds of its experimental results
reach a directional z value of +1.28 compared to the 10 per cent we
would expect by chance. While we expect one of the 29 laboratories
to show a directional z of +1.82 or greater by chance, Table XV shows
that 15 of the laboratories obtained zs of that value or greater. Though
with so many laboratories we would expect one directional z of −1.82
to occur by chance, it is of interest to note that the one negative z
of that size was obtained in the laboratory of a different culture, Japan.
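
The chance benchmarks in this paragraph are quantiles of the standard normal distribution: a directional z of +1.28 cuts off the upper 10 per cent, and with 29 laboratories the z exceeded by one laboratory in 29 is about +1.82. A short check in Python, using the standard library's normal distribution (the variable names are illustrative only):

```python
from statistics import NormalDist

std_normal = NormalDist()

# z exceeded by chance in 10% of cases (the criterion used throughout the tables).
z_ten_percent = std_normal.inv_cdf(1 - 0.10)

# z exceeded by chance in one case out of 29 (one laboratory of the 29 in Table XV).
z_one_in_29 = std_normal.inv_cdf(1 - 1 / 29)

print(round(z_ten_percent, 2), round(z_one_in_29, 2))  # 1.28 1.82
```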

Because so much of the business of the behavioral sciences is transacted
at certain specified p levels, the percentage of experiments and of
laboratories reaching each of a set of standard p levels is shown in Table
XVI.* In addition, the last row shows the number of future replicates
obtaining a directional mean z of exactly .00 required to bring the
overall p to the .05 level.

* Since the preparation of this chapter another nine experiments, by four principal
investigators, became available (Becker, 1968; Minor, 1967, 1967a; Peel, 1967;
Zegers, 1968). The combined p for the nine experiments was .03 (Σz = +5.59);
for the four investigators p < .04 (Σz = +3.58).
TABLE XV

Expectancy Effects Obtained in Different Laboratories

                                                       Nondirectional z     Directional z
    Investigator      Location               Studies   z      % > |1.28|    z        % > +1.28

 1  Adair             Manitoba                  6     2.04       50%       +2.04 b      50%
 2  Adler             Wellesley                 3     4.77      100%        .00         33%
 3  Barber            Medfield a                5     1.40       20%        .00 b        0%
 4  Bootzin           Purdue                    3     3.02      100%        .00 b       67%
 5  Burnham           Earlham                   1     1.50      100%       +1.95       100%
 6  Carlson           Hamline                   2      .00        0%        .00 b        0%
 7  Cooper            CC, CUNY                  1     3.37      100%       +3.37       100%
 8  Getter            Connecticut               1      .00        0%        .00          0%
 9  Harrington        Iowa State                2     2.54      100%       +2.54       100%
10  Hartry            Occidental                2     6.15      100%       +6.15       100%
11  Horn              Geo. Washington           1     2.01      100%       +2.01       100%
12  Ison              Rochester                 2     5.11      100%       +5.11       100%
13  Johnson           New Brunswick             1     3.89      100%       +3.89 b     100%
14  Kennedy           Tennessee                 2     1.61       50%       +1.61 b      50%
15  Larrabee          South Dakota              1     1.60      100%       +1.60       100%
16  Marcia            SUNY, Buffalo             2     3.58      100%       +3.58       100%
17  Masling           SUNY, Buffalo             2     2.44      100%       +1.45 b      50%
18  McFall            Ohio State                2      .00        0%        .00 b        0%
19  Moffat            British Columbia          1      .00        0%        .00          0%
20  Persinger         Fergus Falls a            2     2.50      100%       +2.50 b     100%
21  Raffetto          San Francisco State       1     5.24      100%       +5.24 b     100%
22  Rosenthal         Harvard                  35     8.04       69%       +4.83 b      49%
23  Silverman         SUNY, Buffalo             1     1.88      100%       +1.88 b     100%
24  Timaeus           Cologne                   3     1.98       67%        .00 b       33%
25  Uno               Keio (Tokyo)              5     1.86       40%       −1.86 b       0%
26  Wartenberg-Ekren  Marquette                 1      .00        0%        .00          0%
27  Weick             Purdue                    1     2.33      100%       +2.33       100%
28  Wessler           St. Louis                 3     1.80       67%        .00 b       33%
29  Zoble             Franklin and Marshall     2     3.77      100%       +4.06       100%

    Sum                                              74.43                +54.28
    Sum/√29                                          13.81                +10.07
    Means                                             2.57       71%       +1.87        61%
    Medians                                           2.04      100%       +1.88        67%
    Binomial test z (N = 29)                          8.47                  9.32

a State Hospitals.
b Indicates that experimenter expectancy interacted with other variables at
  z > |1.28|.





ROBERT ROSENTHAL 


TABLE XVI

Percentage of Experiments and Laboratories
Obtaining Results at Specified p Levels

                              Experiments     Laboratories
p                              (N = 94)         (N = 29)

.10                               50%              62%
.05                               35%              52%
.01                               17%              38%
.001                              12%              28%
.0001                              5%              21%
.00001                             3%              14%
.000001                            2%              14%
Grand Sum z                      95.27            54.28
Tolerance for Future
Negative Results a               3,200            1,000

a Replicates required to bring overall p to .05, assuming all replicates to yield a
  mean z of .00 exactly.


Earlier, the possibility was raised that certain judgments and
computations made by the present writer might be in error so that a correction
factor for these errors would be desirable. In addition, there is the
possibility that studies showing no effects of experimenter expectancy might
be less likely to be reported or called to the attention of the writer.
This latter possibility cannot be ruled out in any way, though, at the
time of this writing, interest in publication of “negative findings” seems
as great as interest in publication of “positive findings” of expectancy
effects.

As a fairly stringent correction for the possibility of the writer’s errors
and for the possibility of a biased availability of studies, we assume
that the total number of experiments and of laboratories is ten times
greater than that reported here. The factor of ten was selected on the
basis of the widespread, intentionally exaggerated, and perhaps cynical
fear among behavioral scientists that any given critical value of p gives
the proportion of experiments conducted that come to public knowledge
(Rosenthal, 1966). Since the directional z defined as worth listing in
this review was that associated with a p of .10, the factor of ten was
selected. If we assume that instead of 94 experiments conducted as
tests of the hypothesis of expectancy effects there were actually 940
conducted, what becomes of the overall combined z? It goes to +3.11,
p < .001, assuming that the additional 846 experiments found a mean
directional z of zero exactly. Similarly, if we assume that instead of
29 investigating laboratories there were 290, the overall combined z
for laboratories goes to +3.18, p < .0008, assuming that the additional
261 laboratories found a mean directional z of zero exactly.
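
The ten-fold correction amounts to keeping the grand sum of zs fixed while enlarging the divisor to the square root of the assumed total number of studies. A rough Python sketch of that arithmetic follows, using the Grand Sum z row of Table XVI; the function name is illustrative and one-tailed normal p values are assumed.

```python
from math import sqrt
from statistics import NormalDist


def diluted_z(sum_of_z: float, total_units: int) -> float:
    """Combined z when the same sum of zs is spread over a larger assumed number of units."""
    return sum_of_z / sqrt(total_units)


# Grand sums of directional zs from Table XVI, with ten times as many units assumed.
z_experiments = diluted_z(95.27, 940)   # 94 reported + 846 assumed null experiments
z_laboratories = diluted_z(54.28, 290)  # 29 reported + 261 assumed null laboratories

for z in (z_experiments, z_laboratories):
    print(round(z, 2), 1.0 - NormalDist().cdf(z))
```

Both values come out a little above 3.1, with one-tailed p below .001, in line with the figures quoted in the text.
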

Additional protection against any errors leading us to entertain with
insufficient basis the possibility of expectancy effects comes from
considering any z < |1.28| to be a z of zero. As Table XVI shows, the
distribution of zs is highly skewed such that too many are very much greater
than zero. It seems most likely, therefore, though the check would have
been an onerous task, that the bulk of the zs considered as equal to
.00 was actually also skewed such as to give too many zs of positive
value.


A. Principal Investigators 

With so many experiments in a series it becomes possible to examine
the relationship between outcome and various characteristics of the
principal investigator. Of the 94 experiments, 18 were conducted primarily
by female principal investigators. Of these studies 44 per cent yielded
directional zs of +1.28 or greater compared to the 51 per cent of studies
conducted by male principal investigators, a difference which is quite
trivial (χ² = .07).
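
The χ² of .07 is consistent with a 2 × 2 test carrying Yates' correction for continuity. The sketch below reproduces it under assumptions that should be stated plainly: the cell counts (8 of 18 and 39 of 76) are reconstructed from the rounded percentages in the text, and the use of the continuity correction is inferred from the result rather than stated by the author. The function name is illustrative.

```python
def yates_chi_square(a: int, b: int, c: int, d: int) -> float:
    """Chi-square (1 df) for the 2 x 2 table [[a, b], [c, d]] with Yates' continuity correction."""
    n = a + b + c + d
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Assumed counts: rows are female and male principal investigators;
# columns are studies with directional z >= +1.28 and studies below it.
chi2 = yates_chi_square(8, 10, 39, 37)
print(round(chi2, 2))  # 0.07
```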

In 37 of the 94 experiments, the principal investigator was a student,
and in 43 per cent of these studies the directional z reached or exceeded
+1.28. In those studies in which the senior investigator was not a student
(e.g., a faculty member), 54 per cent of the results reached
that value of z. The difference, however, was very small in the sens

For reasons described in greater detail elsewhere (Rosenthal, [year illegible])
it was felt to be desirable to compare the outcomes of experiments
conducted in the present writer's laboratory with those conducted
elsewhere. Of the 35 experiments conducted in the writer's laboratory, 49
per cent showed a directional z of +1.28 or greater. Of the 59 experiments
conducted elsewhere, 51 per cent showed a z of that magnitude.
The difference between these percentages was trivial (χ² = [illegible]).

Of the 35 experiments conducted in the writer's laboratory 15 were
conducted by students, and of these 15 experiments, 27 per cent
showed a directional z of +1.28 or greater. This proportion
was just half the proportion of 54 per cent found in the remaining
79 studies (χ² = 2.[illegible], p < .10). Since the sample sizes of the
experiments conducted by the writer’s students were no smaller
than those of the remaining investigations, the differences
were probably not due to differences in statistical power, though such
differences might account for the failure of individual experiments to reach
a directional z of +1.28. There seems to be no ready explanation for
the differences in outcome, but one hypothesis, suggested by the work
of Adler (1968), may be considered. Adler’s results suggested that
experimenters made particularly sensitive to the importance of “following
scientific procedures” tended to obtain data that not only did not confirm
their expectations but actually tended significantly to disconfirm their
expectations. Perhaps a similar phenomenon may occur among principal
investigators. Emphasis on the investigator’s remaining blind to
experimenters’ treatment conditions may generate a sensitivity to procedures
among student investigators that tends to reverse the directionality of
expectancy effects. Consistent with such an hypothesis would be the
finding that among these students, a greater proportion obtain zs of
−1.28 or less. Although the total number of such negative zs is too
small to permit strong inference, it is of interest to note that 13 per
cent of the students’ experiments found zs that low compared to 8 per
cent of the remaining investigations.

B. Magnitude of Expectancy Effects 
So far we have discussed the results of studies of expectancy effect
only in terms of the zs obtained. By itself such information does not
tell us how large the effects of expectancy tend to be. Given a very
large sample size, even effects of trivial magnitude can reach any
specified level of z. We want, therefore, to have some estimates of the
magnitude of expectancy effects quite apart from the question of the “reality”
of the phenomenon.

One such estimate can be obtained by computing the proportion of
experimenters whose obtained responses have been brought into line
with their expectations. For this computation we need the mean of the
responses obtained by each experimenter in each of two different
conditions of expectation. For those experiments in which each experimenter
was given one expectation for some of his subjects and a different
expectation for other subjects, the mean difference between responses
of the two groups of subjects is all that is needed. If an experimenter
obtained more of the expected responses from the subjects of whom
he expected them than from the other subjects, that experimenter is
counted as showing expectancy effects.

For those experiments in which experimenters were given the same
expectancy for all their subjects, a preliminary computation was
required. The grand mean response obtained was computed separately for all
experimenters given one expectation and again for all experimenters given
the opposite expectation. An experimenter in the condition of expecting
more X-type responses was counted as showing expectancy effect if
his mean obtained responses showed more X than did the grand mean
of the experimenters in the condition led to expect fewer X responses.
An experimenter in the condition of expecting fewer X-type responses
was counted as showing expectancy effect if his mean obtained responses
showed fewer X than did the grand mean of the experimenters in the
condition led to expect more X responses. The analogous procedure
was also employed for estimating the proportion of subjects whose
responses were in the direction of their experimenter's expectancy.
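
The counting rule for experiments in which each experimenter held a single expectancy can be written out as a short procedure. The Python sketch below is only an illustration: the function name and the experimenter means are hypothetical, not data from any study reviewed here.

```python
from statistics import mean


def proportion_showing_effect(expect_more, expect_fewer):
    """Proportion of experimenters whose mean obtained response lies in the expected direction."""
    grand_more = mean(expect_more)    # grand mean, experimenters led to expect more X
    grand_fewer = mean(expect_fewer)  # grand mean, experimenters led to expect fewer X
    # Expect-more experimenters count if they exceed the opposite group's grand mean.
    hits = sum(m > grand_fewer for m in expect_more)
    # Expect-fewer experimenters count if they fall below the opposite group's grand mean.
    hits += sum(m < grand_more for m in expect_fewer)
    return hits / (len(expect_more) + len(expect_fewer))


# Hypothetical mean photo ratings obtained by eight experimenters.
expect_success = [1.2, 0.4, -0.3, 0.9]
expect_failure = [-0.8, 0.1, 0.6, -0.5]
print(proportion_showing_effect(expect_success, expect_failure))  # 0.75
```

For designs in which each experimenter held both expectancies, the comparison reduces to checking whether the experimenter's own mean under one expectation exceeds his mean under the other.
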
Table XVII shows the results of the analyses performed. There were
27 studies for which the counts for subjects could be made with moderate
effort. The selection was based not on a random sampling basis
but rather on the basis of the availability of the data required. Data
were also available from two other studies, but because they were
associated with such unusually large directional z values, they were not
included in the analysis. The mean directional z of the 27
studies employed was identical to that of all the experiments
considered. Approximately 60 per cent of subjects gave responses
consistent with the expectation of their experimenter.

TABLE XVII

Proportions of Subjects and Experimenters Showing Expectancy Effects

                                         Subjects     Experimenters

Number of Studies                           27             57
Median z of Studies                       +1.28        [illegible]
Number of Ss or Es (N)                     1370            523
Mean N per Study                            51              9
Weighted Percent of Biased Ss or Es         59%            69%
Median Percent of Biased Ss or Es           62%            75%

For the analysis based on experimenters, more of the studies provided
the necessary information so that we have the data based on 57
experiments. The median directional z of these experiments was
less than +1.28 so that the sample is biased in the direction
of overrepresenting studies scored as .00. Approximately 70
per cent of experimenters obtained data in the direction of their
expectation.

How are we to account for the difference in proportion of subjects
versus proportion of experimenters affected by expectancy effects? It
was possible, of course, that the difference was in some way an
artifact of the difference in samples of experiments yielding
priate information. If the difference were not an artifact, however, it
would suggest that expectancy effects are relatively more widespread
among experimenters but that the effects per experimenter are relatively
smaller. This interpretation is made more plausible by the results of
an analysis comparing the proportion of experimenters showing
expectancy effects with the proportion of subjects affected for just
those experiments for which both types of information were available.
There were 26 such samples, and for 24 of them the proportions of
affected subjects and experimenters were either both above the grand
median or both below it (z = 3.94). The median percentage of affected
subjects was 66 per cent, and the median percentage of affected
experimenters was 75 per cent. The latter value is identical to the
percentage based on all 57 studies so that it seems likely that the
studies for which subject data were available were not unrepresentative
of the larger number of studies for which experimenter data were
available. Because neither in the case of the analysis based on
subjects nor in that based on experimenters was the analysis
sufficiently exhaustive, nor even necessarily representative, we should
not take these estimates as very precise. Perhaps as a crude guide to
the estimation of expectancy effects and to the planning of the sample
sizes required in future research, we can give as a reasonable index
that about two thirds of subjects and of experimenters will give or
obtain responses in the direction of the experimenter’s expectancy.

Though we have been able to arrive at some estimate, however crude,
of the magnitude of expectancy effects, we will not know quite how
to assess this magnitude until we have comparative estimates from other
areas of behavioral research. Such estimates are not easy to come by
ready made, but it seems worthwhile for us to try to obtain such
estimates in the future. Although in individual studies, investigators
occasionally give the proportion of variance accounted for by their
experimental variable, it is more rare that systematic reviews of
bodies of research literature give estimates of the overall magnitude
of effects of the variable under consideration. It does not seem an
unreasonable guess, however, to suggest that in the bulk of the
experimental literature of the behavioral sciences, the effects of the
experimental variable are not impressively “larger,” either in the
sense of magnitude of obtained zs or in the sense of proportion of
subjects affected, than the effects of experimenter expectancy. The
best support for such an assertion would come from experiments in
which the effects of experimenter expectancy are compared directly, in
the same experiment, with the effects of some other experimental
variable believed to be a significant determinant of behavior.
Fortunately, there are two such experiments to shed light on the
question.

The first of these was conducted by Burnham (1966). He had 23
experimenters each run one rat in a T-maze discrimination problem.
About half the rats had been lesioned by removal of portions of the
brain, and the remaining animals had received only sham surgery which
involved cutting through the skull but no damage to brain tissue. The
purpose of the study was explained to the experimenters as an attempt
to learn the effects of lesions on discrimination learning. Expectancies
were manipulated by labeling each rat as lesioned or nonlesioned. Some
of the really lesioned rats were labeled accurately as lesioned but some
were falsely labeled as unlesioned. Some of the really unlesioned rats
were labeled accurately as unlesioned but some were falsely labeled
as lesioned. Table XVIII shows the standard scores of the ranks of

TABLE XVIII

Discrimination Learning as a Function
of Brain Lesions and Experimenter Expectancy

                         Expectancy
Brain state         Lesioned   Unlesioned      Σ      z of difference

Lesioned              46.5        49.0        95.5
Unlesioned            48.2        58.3       106.5        +1.40 a
Σ                     94.7       107.3
z of difference             +1.60 b

a By unweighted means F test; z = +1.47 by U test.
b By unweighted means F test; z = +1.65 by U test.

performance in each of the four conditions. A higher score indicates
superior performance. Animals that had been lesioned did not perform
as well as those that had not been lesioned, and animals that were
believed to be lesioned did not perform as well as those that were
believed to be unlesioned. What makes this experiment of special
interest is that the effects of experimenter expectancy were at least
as great as those of actual removal of brain tissue (the z associated
with the interaction was only about 1.0).

A number of techniques for the control of experimenter expectancy
effects have been described elsewhere in detail (Rosenthal, 1966). One
of these techniques, the employment of expectancy control groups, is
well illustrated by Burnham’s design. The experimenter expectancy
variable is permitted to operate orthogonally to the experimental
variable in which the investigator is ordinarily most interested. Ten
major types of outcomes of expectancy controlled experiments have been
outlined and Burnham’s result fits most closely that outcome labeled as
Case 3 (Rosenthal, 1966, 382). If an investigator interested in the
effects of brain lesions on discrimination learning had employed only
the two most commonly employed conditions, he could have been seriously
misled by his results. Had he employed experimenters who believed the
rats to be lesioned to run his lesioned rats and compared their results
to those obtained by experimenters running unlesioned rats and believing
them to be unlesioned, he would have greatly overestimated the effects
on discrimination learning of brain lesions. For the investigator
interested in assessing for his own area of research the likelihood and
magnitude of expectancy effects, there appears to be no substitute for
the employment of expectancy control groups. For the investigator
interested only in the reduction of expectancy effects, other
techniques such as blind or minimized experimenter-subject contact or
automated experimentation (Kleinmuntz and McLean, 1968; McGuigan, 1963;
Miller, Bregman, and Norman, 1965) are among the techniques that may
prove to be useful.
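
The argument for expectancy control groups can be checked against the four cell values printed in Table XVIII. The sketch below is an illustrative calculation, not Burnham’s analysis: the variable and function names are hypothetical, and the differences are simple arithmetic on the tabled standard scores rather than the F or U tests the table reports.

```python
# Cell values as printed in Table XVIII, keyed by (brain state, expectancy).
cells = {
    ("lesioned", "lesioned"): 46.5,
    ("lesioned", "unlesioned"): 49.0,
    ("unlesioned", "lesioned"): 48.2,
    ("unlesioned", "unlesioned"): 58.3,
}

def margin(factor, level):
    """Sum the two cells sharing one level of a factor
    (factor 0 = brain state, factor 1 = expectancy)."""
    return sum(score for key, score in cells.items() if key[factor] == level)

# Orthogonal comparisons, each based on all four cells.
brain_state_diff = margin(0, "unlesioned") - margin(0, "lesioned")
expectancy_diff = margin(1, "unlesioned") - margin(1, "lesioned")

# Lesion effect averaged over the two expectancy conditions.
mean_lesion_effect = brain_state_diff / 2

# Confounded comparison: only the two cells in which the label matches
# the true brain state, as in the two most commonly employed conditions.
confounded_diff = (cells[("unlesioned", "unlesioned")]
                   - cells[("lesioned", "lesioned")])

print(round(brain_state_diff, 1), round(expectancy_diff, 1))
print(round(mean_lesion_effect, 2), round(confounded_diff, 1))
```

In these units the confounded comparison is roughly twice the lesion effect estimated with expectancy held orthogonal, which is the overestimation the text describes.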
The first of the experiments to compare directly the effects of
experimenter expectancy with some other experimental variable employed
animal subjects. The next such experiment to be described employed
human subjects. Cooper, Eisenberg, Robert, and Dohrenwend (1967) wanted
to compare the effects of experimenter expectancy with the effects of
effortful preparation for an examination on the degree of belief that
the examination would actually take place.

Each of ten experimenters contacted ten subjects; half of the subjects
were required to memorize a list of 16 symbols and definitions that
were claimed to be essential to the taking of a test that had a 50-50
chance of being given, while the remaining subjects, the “low effort”
group, were asked only to look over the list of symbols. Half of the
experimenters were led to expect that “high effort” subjects would be
more certain of actually having to take the test, while half of the
experimenters were led to expect that “low effort” subjects would be
more certain of actually having to take the test.

Table XIX summarizes the subjects’ degree of certainty of having to
take the test. There was a tendency for subjects who had exerted
greater effort to believe more strongly that they would be taking the
test. Surprising in its magnitude was the finding that experimenters
expecting to obtain responses of greater certainty obtained such
responses to a much greater degree than did experimenters expecting
responses of lesser certainty. The ratio of expectancy effect to effort
TABLE XIX

Certainty of Having to Take a Test as a Function of
Preparatory Effort and Experimenter Expectancy

                       Expectancy
Effort level         High       Low         Σ      z of difference

High                 +.64       -.40       +.24
Low                  +.56       -.52       +.04        +0.33 a
Σ                   +1.20       -.92
z of difference           +3.37 a

a By F test.


effect mean squares exceeds 112. In the terms of the discussion of
expectancy control groups referred to earlier, these results fit well
the so-called case 7 (Rosenthal, 1966, 384). Had this experiment been
conducted employing only the two most commonly encountered conditions,
the investigators would have been even more seriously misled than
would have been the case in the earlier mentioned study of the effects
of brain lesions on discrimination learning. If experimenters, while
contacting high effort subjects, expected them to show greater
certainty, and if experimenters, while contacting low effort subjects,
expected them to show less certainty, the experimental hypothesis might
quite artifactually have appeared to have earned strong support. The
difference between these groups might have been ascribed to effort
effects while actually the difference seems due almost entirely to the
effects of the experimenter’s expectancy.


VI. MODERATING VARIABLES

Except for the very first few experiments in each of the research
domains described, the bulk of the 94 studies summarized were not
designed primarily to test the hypothesis of expectancy effects.
Rather, these studies were designed to learn something of the
conditions which increase, decrease, or otherwise modify the effects of
experimenter expectancy. Approximately half of the experiments (49 per
cent) and half of the laboratories (52 per cent) obtained one or more
interactions of experimenter expectancy with some other variable with
an associated |z| ≥ 1.28. Many of the specific interactions were
investigated in more than one experiment.
as showing directional z < +1.28. Though it would be difficult to
attach an exact p value to this result, the fact that such consistent
results of tests of interactions were obtained from studies showing no
main effects of experimenter expectancy puts an additional strain on
the credibility of the null hypothesis that expectancy effects do not
occur.

All the interactions shown in Table XX were based on the person
perception task. The reason for this was that for no other task was
there more than a single study available to shed light on the nature
of the interaction of expectancy effects and sex of experimenter and/or
subject. Another experiment testing the difference between male and
female experimenters in magnitude of expectancy effect was available.
That was the study by Raffetto (1967, 1968) of reports of hallucinatory
experiences. At z = -1.65 he found that for this task it was female
experimenters who showed the greater expectancy effects. It seems
possible that whether male or female experimenters show the greater
expectancy effects may depend upon the specific nature of the
experiment.

In the second column of Table XX a positive z means that female
subjects were more susceptible to the effects of experimenter
expectancy. There is no consistency to these results and perhaps all
that can be said is that sometimes male subjects and sometimes female
subjects show greater susceptibility to expectancy effects. Of the
seven studies represented in Column II, five were tabulated earlier as
showing directional z < +1.28. In his experiment employing a marble
dropping task, Johnson (1967) had found female subjects to be more
susceptible than male subjects to expectancy effects (z = +1.72) at
least under some conditions.

In the third column of Table XX we find the results of a highly
specific three way interaction between sex of subject, sex of
experimenter, and experimenter expectancy. The first of these studies
(38) found net positive expectancy effects among male experimenters
contacting either male or female subjects and among female
experimenters contacting female subjects. However, when female
experimenters contacted male subjects, the expectancy effect was
reversed, with subjects responding in the direction opposite to that
which experimenters had been led to expect. Just that same pattern was
obtained in two other analyses. Of the seven studies represented in
column III, all but one were tabulated earlier as showing directional
zs < +1.28.

Three other studies have reported interactions involving
simultaneously the sex of experimenter and subject and magnitude of
expectancy effect. Johnson (1967), in his marble-dropping experiment,
found
A. Sex of Participants

In a great many of the experiments summarized, it would have been
possible to examine the interaction of expectancy effects with sex of
experimenter, sex of subject, and sex of dyad. This was done, however,
or reported in only a fraction of the studies so that it was not
possible to have an exhaustive inventory of such interactions.
Therefore, we cannot sensibly employ the technique of combining zs to
obtain an overall estimate of the interaction of experimenter
expectancy with the sex of the participants. What was possible was to
find those experiments in which a relationship was found or reported in
which z reached an absolute value of 1.28. Summaries based on such
results, then, will have little to offer in the way of estimating the
frequency of a relationship. Instead they will be limited to estimating
the proportion of results in a specific direction for just that
subsample of studies in which results reached the specified value of
|z|.

Table XX shows the directional zs associated with interactions of
expectancy effects with sex of participants for studies of person
perception. In the first column, a positive z means that male
experimenters showed greater expectancy effects than did female
experimenters. When more than a single study is associated with a
single z it means that the interaction was based on the combined
samples. The finding of all four zs as positive suggests that when
differences in expectancy effects are found between male and female
experimenters, male experimenters tend to show the greater expectancy
effect. It is interesting to note that all six of the studies listed in
this first column were tabulated


TABLE XX

Expectancy Effects as a Function of Sex
of Participants in Studies of Person Perception

           I                       II                      III
  Sex of experimenter        Sex of subject            Sex of dyad
   Study         z           Study        z           Study       z

   14 a, 15    +1.44         14, 15     -1.44         6-10      +1.51
   22, 23      +1.41         18         -2.85         38        +1.64
   39          +1.90         30         +2.58         39        +1.51
   43 b        +2.07         40, 41     +1.96
                             42         +1.64

a Numbers refer to those of Table XI.
b Refers to expectancy effects transmitted via research assistants;
  see Rosenthal, 1966, 232.




of the experimenter, the real subjects subsequently contacted were
affected by a change in the experimenter’s behavior also to disconfirm
his experimental hypothesis. It seems possible, then, that the results
of behavioral research can, by virtue of the early data returns, be
determined partially by the performance of just the first few subjects
(Rosenthal, 1966).

In some of the experiments conducted, it was found that when
experimenters were offered a too-large and a too-obvious incentive to
affect the results of their research, the effects of expectancy tended
to diminish. It speaks well for the integrity of student experimenters
that when they felt bribed to obtain the data they had been led to
expect, they seemed actively to oppose the principal investigators.
There was a tendency for those experimenters to “bend over backward”
to avoid the biasing effects of their expectation, but sometimes with
their bending so far backward that the results of their experiments
tended to be significantly opposite to the results they had been led to
expect (Rosenthal, 1966).

In several experiments in which each experimenter was given two
different expectancies for two allegedly different subsamples of
subjects, the distribution of expectancy effects showed a significant
and interesting skew. In each of three such studies, which were not at
all homogeneous in the overall magnitude of expectancy effects
obtained, a significant minority of experimenters obtained results more
negative in direction than could reasonably be expected by chance.
These three studies are summarized in Table XXI in which the first
listed study employed animal subjects and the others employed human
subjects performing the photo rating task.

Since each experimenter had contacted some subjects under different
conditions of expectation, magnitude of expectancy effect was defined
simply as the mean response obtained under one condition of expectation
minus the mean response obtained under the opposite condition of
expectation. In order to make the units of measurement of the different
studies more comparable, each distribution of difference scores was
divided into ten equal intervals, five above an absolute difference
score of .00 and five below. All three studies show a substantial
minority (14 to 20 per cent) of experimenters to obtain data
significantly opposite to what they had been led to expect. This type
of finding suggests the possibility that there are some experimenters
who react to being given an expectancy either by bending over backward
to avoid biasing their data, or perhaps because of resentment at being
told what to expect, by in some way showing the expectancy inducer that
he was wrong to make the prediction he made. If these minority
reactions to induced expectancies were widespread, it might be of
interest to try to learn
that when experimenter and subject were of the same sex there were
greater expectancy effects than when experimenter and subject were
of the opposite sex (z = +1.80). Just the opposite results, however,
were obtained by Adair (1968) employing a numerosity estimation task
(z = -2.33) and by Silverman (1968) employing a reaction time measure
(z = -1.61). Both these investigators found greater expectancy effects
when experimenters and subjects were of the opposite sex. The joint
effects of experimenter and subject sex may sometimes be significant
determinants of the direction and magnitude of expectancy effects, but
it seems likely that the type of task employed may be a further
complicating variable.

B. Experimenter Dominance

On the basis of a variety of evidence presented elsewhere (Rosenthal,
1966), it was suggested that experimenters showing greater dominance
or a greater degree of professionalness in their behavior were likely
to show greater effects of their experimental hypotheses. This
interaction of a specific experimenter characteristic with magnitude of
expectancy effect has recently received some fairly strong support in
three experiments conducted by Bootzin (1968). In all three studies,
Bootzin found more dominant experimenters to show greater effects of
their induced expectations. The three obtained zs were +2.05, +3.30,
and +2.17; the combined z was +4.35, p < .000008.
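
The combined z reported for Bootzin’s three studies can be approximately reproduced. The sketch below assumes that the zs were combined by the Stouffer method (the sum of the zs divided by the square root of their number) and that the p value is one-tailed; neither detail is spelled out in this passage, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def combine_zs(zs):
    """Stouffer combination: sum of the zs over the square root of
    their number (assumed here to be the method used)."""
    return sum(zs) / math.sqrt(len(zs))

combined = combine_zs([2.05, 3.30, 2.17])

# One-tailed probability of a standard normal deviate this large or larger.
p_one_tailed = 1 - NormalDist().cdf(combined)

print(round(combined, 2), p_one_tailed)
```

Computed from the rounded zs, the result falls just below the reported +4.35, a discrepancy that would be expected if the original calculation used unrounded values; the one-tailed p remains below the reported bound of .000008.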
This result may well be related to the finding that where there are
differences between male and female experimenters in magnitude of
expectancy effects, it is the male experimenters who are likely to show
the greater effects. It seems reasonable to suppose that, in general,
male experimenters are likely to be classed as more dominant than are
female experimenters.

C. Other Variables

There are a good many other variables that have been shown to interact
significantly with the effects of experimenter expectancy. Later, we
shall have occasion to refer to some, but because so many of these
interactions have been described elsewhere in some detail (Rosenthal,
1966) we need give here only some illustrations.

Through the employment of accomplices serving as the first few
subjects it was learned that when the responses of the first few
subjects confirmed the experimenter’s hypothesis, his behavior toward
his subsequent subjects was affected in such a way that these subjects
tended to confirm further the experimenter’s hypothesis. When
accomplices serving as the first few subjects intentionally
disconfirmed the expectation
directional zs associated with the main effects of experimenter
expectancy were less than +1.28 and, therefore, were recorded as zs of
.00.

Earlier, reference was made to the experiment by Adler (1968) in
which the set given the experimenters was an important determinant
of the direction of the subsequent expectancy effects. Other such
results have also been reported (Rosenthal, 1966; Rosenthal and
Persinger, 1968) as have results showing the effects of subject set on
the direction and magnitude of expectancy effects (Rosenthal, 1966;
White, 1962).

In a number of studies where there was a conflict between what
an experimenter had been led to expect and what he himself actually
expected, these two sources of hypothesis were found to interact
significantly (Bootzin, 1968; Nichols, 1967; Strauss, 1968), but
sometimes they did not (Marcia, 1961; Marwit & Marcia, 1967).

For two samples of male experimenters, it has been reported that
those who exchanged fewer glances with their subjects during the
instruction reading phase of the person perception experiment
subsequently showed greater expectancy effects (Rosenthal, 1966, 268).
The more recent work of Connors (1968) bears out this finding
(z = +2.12).

Other studies of variables complicating the effects of experimenter
expectancy have investigated the effects of experimenter and subject
need for approval, experimenter and subject anxiety, degree of
acquaintanceship between experimenter and subject, experimenter status,
and characteristics of the laboratory in which the interaction occurs.
In general, the results of these studies have been complex, with far
too many results of large zs, but with the signs sometimes in one
direction and sometimes in the other. For many of these moderating
variables there appear to be meta-moderating variables (Rosenthal,
1966).


VII. THE MEDIATION OF EXPECTANCY EFFECTS

How are we to account for the results of the experiments described?
How does an experimenter unintentionally inform his subjects just what
response is expected of him? Our purpose in this section is to review
the evidence that may shed light on this question. First, however, we
must take up the proposition that there is nothing to be explained,
that our talk about an artifact is based on nothing but other artifacts.

A. Expectancy Effects as Artifacts

Cheating and recording errors have been suggested as prime candidates
for consideration as the artifacts leading to the false conclusion
that experimenters’ expectancies may serve as significant partial determi-
TABLE XXI

Proportions of Experimenters Showing Various
Magnitudes of Expectancy Effects in Three Studies

                              Study
           I & H,        R, P, M, V, G,   R, P, M, V, G,
Effect     1966, II      1964b I          1964b II          Combined
           (N = 15)      (N = 13)         (N = 7)           (N = 35)

  +5         .00           .00              .00               .00
  +4         .00           .00              .00               .00
  +3         .13           .08              .00               .09
  +2         .40           .31              .00               .29
  +1         .27           .23              .43               .29
  -1 a       .00           .23              .43               .17
  -2         .00           .00              .00               .00
  -3         .00           .00              .00               .00
  -4         .13           .08              .00               .09
  -5         .07           .08              .14               .09
  z b      +2.58         +2.33            +2.58             +4.33

a Includes .00 effect.
b For asymmetry.

membership in this subset of experimenters
who react to induced expectations with such negative and non-Gaussian
[illegible]
no expectation obtained more variable responses than did experimenters
expecting either high or low rates of conditioning [illegible].
Similarly, in an experiment on [illegible],
[illegible] and Timaeus (1967) found that control group experimenters
obtained more variable responses than did experimenters expecting
either over-estimation or under-estimation (z = +1.88). In both of
these experiments, as was often the case in studies showing interaction
effects, the
the effects of experimenter expectancy were greater in the study with
better controls for observer errors and cheating. In the less
well-controlled study, experimenters expecting more responses obtained
an average of 73 per cent more responses than did the experimenters
expecting fewer responses. In the better controlled study, however,
experimenters expecting more responses obtained an average of 211 per
cent more responses than did the experimenters expecting fewer
responses.

The experiment by Persinger, Knutson, and Rosenthal (1968) was
filmed and tape-recorded without the knowledge of experimenters or
subjects. Independent observers then recorded subjects’ responses
directly from the tape recordings and these recordings were compared
to those of the original experimenters. It was found that 72 per cent
of the experimenters’ transcriptions were in error and that 48 per cent
of the transcriptions erred in the direction of the experimenters’
hypothesis while 24 per cent of the transcriptions erred in the
direction opposite to that of the experimenters’ hypothesis. These
latter errors, however, tended to be larger than the errors favoring
the hypothesis, so that the mean net error per experimenter was -.0003
in the direction opposite to the experimenters’ expectancies and so
trivial in magnitude that analyses based on either the corrected or
uncorrected transcriptions gave the same results (directional z of
+1.64).

Analysis of the films of this and of other experiments (Rosenthal,
1966) in which experimenters did not know they were being filmed
gave no evidence to suggest any attempts to cheat on the part of the
experimenters. Similarly, other analyses of the incidence of recording
errors show their rates to be too low to account for the results of
studies of experimenter expectancy or most other studies for that
matter. It is, of course, possible that in any single experiment in the
behavioral sciences, cheating or recording errors may occur to a
sufficient extent to account for the obtained results. It seems
unlikely, however, that any replicated findings of the behavioral
sciences, especially if replicated in different laboratories, could
reasonably be ascribed either to intentional errors or to recording
errors.

Our discussion has been of cheating and of observer errors serving
as artifacts in the production of an effect which can itself be regarded
as an artifact in behavioral research, the expectancy of the
experimenter. Our discussion would be incomplete, however, without a
systematic consideration of what it would mean if we had found effects
of experimenter expectancy to be associated with artifacts of cheating
and of observer errors. Earlier discussions of this problem have,
unfortunately, been incomplete in this regard (Barber and Silver, 1968;
Rosenthal, 1964a). Barber and Silver, for example, suggest that if it
could be estab-
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nants of subjects’ responses (Barber and Silver, 1968, Rosenthal, 1964a) 
There is no way to rule out with any certainty the operation of either 
intentional “errors” or errors of observation in most of the individual 
experiments investigating the effects of experimenter expectancy — ^but 
there is no way to rule out the operation of these errors in the vast 
majonty of the research in the behavioral sciences What we can do 
IS to rule out the operation of cheating and observer errors as necessary 
factors operating in studies of expectancy effects There are a number 
of experiments which do permit us to rule out the operation of such 
errors 

Earlier, the experiment by Adair and Epstein (1967, II) was described.
It will be recalled that in this study there were no experimenters,
only tape recordings of the voices of experimenters, and tape recordings
cannot err either intentionally or unintentionally. In this experiment,
in which subjects recorded their own responses, the directional z
associated with expectancy effects was +1.64.

The experiment by Johnson (1967) similarly ruled out the operation
of intentional or observer errors. The recording of subjects’ responses
was accomplished by an electrical system which did the bookkeeping.
The tallies were then transcribed by the principal investigator, who was
blind to the experimental condition of experimenter expectancy in which
each subject had been contacted. Despite the tightness of the controls
for cheating and for observer errors, Johnson’s results showed a very
large effect of experimenter expectancy with a directional z of +3.89.

The experiment by Weick (described in Rosenthal, 1966) was another
in which cheating and observer errors were unlikely to occur. That
experiment was conducted in a classroom under the watchful eyes of
students in a class in experimental social psychology. Despite the
restraint such an audience might be presumed to impose on the intentional
errors or the careless errors of an experimenter, the obtained directional
z was +2.33.

Because of the small size of the animals involved, experiments employing
planaria would seem to be especially prone to quasi-intentional errors
or to errors of recording. Since it is often difficult to judge the behavior
of planaria, experimenters might too often judge or claim a response
to have occurred when that response was expected. Hartry (1966)
conducted two experiments on the effects of experimenter expectancy on
the results of studies of planaria performance. In one of these studies,
special pains were taken to reduce the likelihood of observer or
intentional errors. Experimenters were given more intensive training, an
instructor was present during the conduct of the experiment, and three
observers were present to record the worm’s response. Quite surprisingly,
the effects of experimenter expectancy were greater in the study with
better controls for observer errors and cheating. In the less well
controlled study, experimenters expecting more responses obtained an
average of 73 per cent more responses than did the experimenters expecting
fewer responses. In the better controlled study, however, experimenters
expecting more responses obtained an average of 211 per cent more
responses than did the experimenters expecting fewer responses.

The experiment by Persinger, Knutson, and Rosenthal (1968) was
filmed and tape recorded without the knowledge of experimenters or
subjects. Independent observers then recorded subjects’ responses
directly from the tape recordings and these recordings were compared
to those of the original experimenters. It was found that 72 per cent
of the experimenters’ transcriptions were in error and that 48 per cent
of the transcriptions erred in the direction of the experimenters’
hypothesis while 24 per cent of the transcriptions erred in the direction
opposite to that of the experimenters’ hypothesis. These latter errors,
however, tended to be larger than the errors favoring the hypothesis, so
that the mean net error per experimenter was -.0003, in the direction
opposite to the experimenters’ expectancy and so trivial in magnitude that
analyses based on either the corrected or uncorrected transcriptions gave
the same results (directional z of +1.64).

Analysis of the films of this and of other experiments (Rosenthal,
1966) in which experimenters did not know they were being filmed
gave no evidence to suggest any attempts to cheat on the part of the
experimenters. Similarly, other analyses of the incidence of recording
errors show their rates to be too low to account for the results of studies
of experimenter expectancy or most other studies for that matter. It
is, of course, possible that in any single experiment in the behavioral
sciences, cheating or recording errors may occur to a sufficient extent
to account for the obtained results. It seems unlikely, however, that
any replicated findings of the behavioral sciences, especially if replicated
in different laboratories, could reasonably be ascribed either to
intentional errors or to recording errors.

Our discussion has been of cheating and of observer errors serving
as artifacts in the production of an effect which can itself be regarded
as an artifact in behavioral research, the expectancy of the experimenter.
Our discussion would be incomplete, however, without a systematic
consideration of what it would mean if we had found effects of
experimenter expectancy to be associated with artifacts of cheating and of
observer errors. Earlier discussions of this problem have, unfortunately,
been incomplete in this regard (Barber and Silver, 1968; Rosenthal,
1964a). Barber and Silver, for example, suggest that if it could be
established that such meta-artifacts as cheating and observer errors
accounted for the results of studies showing expectancy effects at some
specified level of z, then this would be sufficient to rule out the effects
of experimenter expectancy as a source of artifact in other research.
Unfortunately, the situation is a good deal more complex than that simple
inference would suggest.

Table XXII presents a schema for the consideration of a variety of
experimental outcomes in relation to the artifact of expectancy effects
and the meta-artifacts of intentional and recording errors. We let the
“primary variable” stand for whatever a given behavioral researcher
is currently investigating, other than expectancy effects.

TABLE XXII

Schema for the Consideration of Experimental Results
as a Function of Artifacts and Meta-Artifacts

                              Effects of meta-artifact
Experimental results    Decrease z    Trivial effect    Increase z

PRIMARY VARIABLE
  Positive z            Case 1        Case 2            Case 3
  Trivial z             Case 4        Case 5            Case 6
  Negative z            Case 7        Case 8            Case 9

EXPECTANCY EFFECT
  Positive z            Case 10ᵃ      Case 11           Case 12
  Trivial z             Case 13       Case 14           Case 15
  Negative z            Case 16       Case 17           Case 18

ᵃ The best documented case of cheating among experimenters to come to our
attention occurred in research involving animal subjects in which allegedly
dull animals were helped to perform better, thus decreasing the effects of
experimenter expectancy.

The three columns of Table XXII represent the three broad classes of
effects of cheating or recording errors: (a) effects decreasing the
obtained z, (b) effects of arbitrarily trivial magnitude, and (c) effects
increasing the obtained z. The suggestion by Barber and Silver was
essentially to look only at the cell labeled Case 12. If, in an experiment
on expectancy effects, there were errors to inflate the z, then we need
not concern ourselves any longer with the role of expectancy effects as
an artifact in behavioral research. The conclusion, of course, does not
follow. We want to consider the rest of the possible outcomes.
Case 16 to be sure that a negative z for expectancy effects was not
due to the meta-artifact, or about Case 13 to be sure that a near-zero
z was not depressed from a positive z by cheating or recording errors.
Although our empirical evidence suggests most errors due to cheating
or misrecording to be trivial, it must be kept in mind that these errors
can cut two ways. They can artifactually deflate the obtained effects
as much as they can artifactually inflate the obtained effects.

What makes our schema still more complicated is the necessity for
considering simultaneously the effect of our meta-artifact on expectancy
effects relative to its effect on the primary variable. There is no basis
in data to think so, but if we assume for the moment that Case 12
effects were found, we would want to compare their magnitude and
frequency with those of the Case 3 effects. If it were found that Case
12 occurs often but Case 3 occurs seldom, then we would legitimately
begin to wonder whether expectancy effect research might not be
particularly prone to meta-artifact. But our inquiry would be far from over,
since it must first be seen whether Cases 10, 13, and 16 are not also
overrepresented relative to Cases 1, 4, and 7. What we want, in short,
is something like a 3 × 3 × 2 contingency table that would permit us
to say something of the effect of our meta-artifact on experimenter
expectancy and of the relative effects of the meta-artifact and of
experimenter expectancy on the primary variable.
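
The bookkeeping behind such a table can be sketched in a few lines of Python. This is only an illustration: the function names and the list of classified studies are hypothetical, and only the numbering of the eighteen cases is taken from Table XXII.

```python
from itertools import product

# Axis labels taken from Table XXII; the order fixes the case numbering.
VARIABLES = ("primary variable", "expectancy effect")
RESULTS = ("positive z", "trivial z", "negative z")
META_EFFECTS = ("decrease z", "trivial effect", "increase z")


def case_number(variable, result, meta_effect):
    """Return the Table XXII case number (1-18) for one cell."""
    v = VARIABLES.index(variable)
    r = RESULTS.index(result)
    m = META_EFFECTS.index(meta_effect)
    return 9 * v + 3 * r + m + 1


def tally(studies):
    """Count studies per cell of the 3 x 3 x 2 table.

    `studies` is an iterable of (variable, result, meta_effect) tuples.
    """
    table = {cell: 0 for cell in product(VARIABLES, RESULTS, META_EFFECTS)}
    for cell in studies:
        table[cell] += 1
    return table


# Hypothetical classifications, invented purely for illustration.
example_studies = [
    ("expectancy effect", "positive z", "increase z"),   # Case 12
    ("expectancy effect", "positive z", "increase z"),   # Case 12
    ("primary variable", "positive z", "increase z"),    # Case 3
    ("expectancy effect", "trivial z", "decrease z"),    # Case 13
]
counts = tally(example_studies)
case_12 = counts[("expectancy effect", "positive z", "increase z")]
case_3 = counts[("primary variable", "positive z", "increase z")]
```

Comparing `case_12` with `case_3`, and likewise Cases 10, 13, and 16 with Cases 1, 4, and 7, is the comparison the text calls for; with real data each cell would hold the number of studies so classified.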

B. Operant Conditioning

If intentional errors and recording errors will not do as explanations
of the results of studies of expectancy effect, what will? The most obvious
hypothesis was that experimenters might quite unwittingly reinforce
those responses of their subject that were consistent with their
hypothesis. Any small reinforcer might serve: a smile, a glance, a nod. Under
an hypothesis of operant conditioning, we would expect to find that
the very first response of a given subject is not affected by the
experimenter’s expectancy and that, in general, later responses are more
affected than earlier responses.

Elsewhere, there is a summary of four experiments showing that, on
the average, expectancy effects are greater for the subject’s very first
response than for his later responses (Rosenthal, 1966, 289-293). A more
recent experiment by Wessler (1966) also showed a decrease in expectancy
effect from the subjects’ earlier to later responses (z = +1.65).

The experiment by Adair and Epstein (1967), in which tape recordings
served as experimenters, also served to rule out the operation of
operant conditioning as a necessary mediator of expectancy effects.
Additional, though “softer,” evidence that operant conditioning was not a

In Case 15, for example, we have a trivial z for expectancy effect
which may have been made trivial by the meta-artifact which increased
the z from a negative to a near-zero level. We want to know about
effects. That an hypnotist-experimenter’s expectancy may affect his
treatment of a research subject has been documented earlier, though the
sample sizes involved only a single hypnotist-experimenter and a single
subject (Shor and Schatz, 1960).

The two experiments described suggest that auditory cues may be
sufficient to serve as mediators of expectancy effects. There are two
additional experiments in support of this proposition, both of which have
the additional merit of permitting estimates of the effects on the
magnitude of expectancy effects of subjects’ having available only auditory
cues as compared to having access to both auditory and visual cues.
The possibility of obtaining such estimates depends on having available
at least three groups of experimenters. For two of these groups, subjects
must have access to both visual and auditory cues from their
experimenters, but each group of experimenters must have a different
expectation for their subjects’ responses. The difference between the mean
response obtained by experimenters of these two groups is considered
the base line of magnitude of expectancy effect when both channels
of information are available. The third group of experimenters is given
one of the two possible expectations, but subjects’ access to visual cues
from these experimenters is cut off. The difference between the mean
response obtained by experimenters in this condition and the mean
response obtained by experimenters expecting the opposite response is
considered the magnitude of expectancy effect when only auditory cues
are available. This magnitude can be divided by the base line magnitude
for an estimate of the proportion of expectancy effect obtained when
only auditory cues were available.
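
The arithmetic of this three-group estimate can be sketched as follows. The function and parameter names are hypothetical, and the mean ratings in the usage lines are invented for illustration; only the two subtractions and the final division follow the description above.

```python
def proportion_of_effect(both_channels_mean, opposite_expectation_mean,
                         restricted_mean):
    """Estimate the share of the expectancy effect that survives when
    subjects have access to only one channel of cues.

    both_channels_mean        -- mean response, both channels available,
                                 same expectation as the restricted group
    opposite_expectation_mean -- mean response, both channels available,
                                 opposite expectation
    restricted_mean           -- mean response, one channel cut off
    """
    base_line = both_channels_mean - opposite_expectation_mean
    if base_line == 0:
        raise ValueError("no base line expectancy effect to divide by")
    restricted_effect = restricted_mean - opposite_expectation_mean
    return restricted_effect / base_line


# Invented means on an arbitrary rating scale.
share = proportion_of_effect(
    both_channels_mean=1.0,
    opposite_expectation_mean=-1.0,
    restricted_mean=0.0,
)
```

With these invented means the base line effect is 2.0 and the restricted effect is 1.0, so `share` is 0.5, that is, half of the full effect is obtained through the single remaining channel.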

The two experiments meeting these requirements have been tabulated
earlier as Rosenthal and Fode, 1963b, II and as Zoble, 1968 (the former
study was a master’s thesis by Fode). Fode’s study employed the person
perception task and his data showed that 47 per cent of the total
expectancy effect was obtained when subjects had access only to auditory
cues from their experimenter. Zoble’s study employed a task requiring
subjects to make tone length discriminations but his results were
remarkably similar to Fode’s. Zoble’s data showed that 53 per cent of
the total expectancy effect was obtained when subjects were restricted
to purely auditory cues. The combined z associated with finding
expectancy effects with only auditory cues available to subjects was 01

Additional evidence for the importance of the auditory channel to
the mediation of expectancy effects comes from an analysis by Duncan
and Rosenthal (1968). Sound motion pictures were available of three
male experimenters administering the person perception task to 10
different subjects. An analysis of the experimenters’ vocal emphases showed
factor in the mediation of expectancy effects has been presented by
Marwit (1968) and Masling (1965, 1966), though Marwit and Marcia’s
(1967) data suggested that sometimes operant conditioning might be
a factor.

Just as was the case in our consideration of cheating and recording
errors as explanations of expectancy effect, we cannot conclude that
operant conditioning never operates as a mechanism mediating
expectancy effects. What we can conclude, just as in the case of cheating
and recording errors, is that expectancy effects do occur in the absence
of operant conditioning. Operant conditioning, like cheating and
observer errors, cannot explain the results of studies of expectancy effect.

C. Communication Channels 

The fact that the very first response of an experimental subject can
be affected by the expectancy of the experimenter suggests that the
mediation of expectancy effects must occur, at least sometimes, during
that phase of the data-collection situation in which the experimenter
greets, seats, and instructs his subject. Some beginnings have been made
to learn what the experimenter does unintentionally during this phase
of the experiment to inform his subject of the expected response. These
beginnings are not characterized by spectacular success (Rosenthal,
1966). Data of a more modest sort, however, are beginning to sketch
some picture of the classes of cues likely to be involved in the mediation
of expectancy effects.

There are two experiments to show that auditory cues alone may
be sufficient to mediate expectancy effects. One of these is the study
by Adair and Epstein (1967) in which subjects heard only the
instructions tape-recorded earlier by experimenters given different
expectancies. The z for expectancy effect based upon voice alone was +1.64.
The other experiment was by Troffer and Tart (1964) in which the
experimenters were all experienced hypnotists. They were to read standard
passages to subjects in each of two conditions which may have affected
the expectation of the experimenters. When experimenters had reason
to expect lower suggestibility scores, their voices were found to be
significantly less convincing in their reading of the instructions to their
subjects (z = +2.81). This result was obtained despite the fact that
experimenters (a) were cautioned to treat their subjects identically, (b)
were told that their performances would be tape-recorded, and (c) were
all aware of the problem of experimenter effects. This experiment tells
us of the importance of the auditory cues, but because there was a
plausible rival hypothesis to the hypothesis of expectancy effects the
study was not included in our earlier summary of studies of expectancy
With all the data available to suggest the importance of the auditory
channel in the mediation of expectancy effect, it should not surprise
us that those studies of expectancy effect permitting the subject little
or no auditory access to the experimenter generally failed to obtain
expectancy effects. That was the case in the two studies listed for Carlson
and Hergenhahn (1968) and in that listed for Moffatt (1966), all three
of these studies having been tabulated as showing directional zs of less
than +1.28. The same result occurred in one group of experimenters
in the study conducted by Fode (Rosenthal and Fode, 1963b, II) though
in that study the overall effects of experimenter expectancy were still
associated with a z > 3.00.

So far we have focused on the auditory channel of communication
but there are also data available to show the importance of the visual
channel. One important finding comes from the research by Zoble (1968)
described earlier. As one of his many experimental groups, Zoble had
one group of subjects who had access only to visual cues from their
experimenter. Despite the fact that Zoble’s results helped to support
the importance of auditory cues, his data nevertheless showed that visual
cues were more effective than auditory cues in the mediation of
expectancy effects (z = +1.44). Whereas those subjects who had access only
to auditory cues were affected by their experimenter’s expectancy only
53 per cent as much as those subjects who had access to both visual
and auditory cues, those subjects who had access only to visual cues
were affected by their experimenter’s expectancy 75 per cent as much
as those subjects who had access to both information channels. Zoble’s
results suggest a possible nonadditivity of the information carried in
the visual and auditory channels. It may be that, when subjects are
deprived of either visual or auditory information, they focus more attention
on the channel that is available to them. This greater attention and
perhaps greater effort may enable subjects to extract more information
from the single channel than they could, or would, from that same
channel if it were only one part of a two-channel information input
system.

Much earlier we tabulated the results of two studies of verbal
conditioning by Kennedy’s group. In one of those studies, Kennedy, Edwards,
and Winstead (1968) found the overall directional z associated with
expectancy effects to be less than +1.28. That experiment we count
as a directional z of .00 in our bookkeeping system but for our present
purpose we can afford a closer look at that study. Part of the time
experimenters were face-to-face with their subjects, and part of the time
subjects had no visual access to their experimenter. The failure of the
overall directional z to reach +1.28 seems due entirely to the condition
in which subjects were deprived of visual cues from their experimenter.
that no subject was exposed to identical differential emphases of those
portions of the instructions that listed the subject’s response alternatives.
All five subjects who heard relatively greater vocal emphasis on the
response alternatives associated with high photo ratings subsequently
assigned higher photo ratings than did any of the five subjects who
heard relatively greater vocal emphasis on the response alternatives
associated with low photo ratings (z = +2.65).

The three experimenters on whose differential vocal emphases these
paralinguistic analyses were made had been selected because they were
known to have shown expectancy effects. We expect, therefore, by
definition, to find a large correlation between the various expectancies
given each experimenter and the mean photo rating given by the subject
contacted under each different expectation. That correlation was +.60
(z = +1.75), a finding obviously not reported in support of the
hypothesis of expectancy effect, but rather to establish a base line for
comparison. The correlation between an experimenter’s differential vocal
emphasis on the various response alternatives in the instructions read to
subjects and the subjects’ subsequent response was +.72 (z = +2.33). That
was a promising chain of correlations. The experimenter’s expectancy
predicted his subjects’ responses and the differential vocal emphasis of
the experimenter also predicted his subjects’ responses. It remained only
to show that the experimenter’s expectancy was a good predictor of
how he read his instructions to his subjects. Then everything would
fall nicely into place. Unfortunately, that is not what we found. The
correlation between an experimenter’s expectancy and his
instruction-reading behavior was only +.24, a correlation that is difficult
to defend as being really different from zero with a maximum of eight
degrees of freedom. The correlation between experimenters’ differential vocal
emphases and their subjects’ subsequent photo ratings with the effects
of experimenter expectancy partialed out showed no shrinkage; it was
+.74. Therefore, though this analysis gave further evidence of the
importance of the auditory channel of communication, it did not turn
out to provide the key to the specific signal employed by subjects to
learn what it was that their experimenter expected. Evidence for such
a signal would have been provided only if the correlation between an
experimenter’s expectation and his differential vocal emphasis during
instruction reading had been substantial.*
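
The partialed correlation can be checked against the three zero-order correlations quoted in this paragraph. The sketch below assumes the standard first-order partial correlation formula, which the text does not name; the function and variable names are hypothetical.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# x = differential vocal emphasis, y = subjects' photo ratings,
# z = experimenter's expectancy (correlations as quoted in the text).
r_emphasis_rating = 0.72
r_emphasis_expectancy = 0.24
r_rating_expectancy = 0.60

r_partial = partial_correlation(
    r_emphasis_rating, r_emphasis_expectancy, r_rating_expectancy
)
```

Under that assumption `r_partial` comes out at roughly .74, in line with the value reported above: because expectancy is only weakly related to vocal emphasis, removing it leaves the emphasis-rating correlation essentially unchanged.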

* Rosenberg, in collaboration with Duncan, has recently replicated the effects
on subjects’ responses of differential emphasis in the instruction reader’s
listing of response alternatives. That research was based, of course, on a
different sample of experimenters. For a more detailed discussion of the
interaction between differential vocal emphasis and subjects’ evaluation
apprehension, see the chapter by Rosenberg in this volume.
his cueing behavior, may begin his experiment with little ability to exert
expectancy effects. But all the time, in his interaction with his first
subject, he is emitting a myriad of unprogrammed and unintended cues
in the visual and auditory channels. If whatever pattern of cues he
is emitting happens to affect the subject’s response, so that the
experimenter obtains the response he expects to obtain, that pattern of cues
may be more likely to recur with the next subject. In short, obtaining
an expected response may be the reinforcement required to shape the
experimenter’s pattern of unintentional cueing. Subjects, then, may teach
experimenters how to behave kinesically and paralinguistically so as to
increase the likelihood that the next subject’s response will be more
in the direction of the experimenter’s expectancy. Our old friend Pfungst,
the student of Clever Hans, found that as experimenter-questioners
gained experience in questioning Hans, they became better unintentional
signalers to Hans.

If we are seriously to entertain the proposition that expectancy effects
are learned in an interpersonal context, then we must be able to show
that, in fact, experimenters are more successful in their unintentional
influencing of subjects later, rather than earlier, in the sequence of
subjects contacted. Elsewhere there is a report of six analyses investigating
this question. In three of the samples studied, subjects contacted later
in the series showed greater effects (z > +1.28) of experimenter
expectancy, while three of the samples showed no order effect (Rosenthal,
1966). The overall directional z in support of the learning hypothesis
was +2.73 (Σz = +6.70, N = 6). Since that earlier summary a number
of other relevant findings have become available.
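
The overall z quoted for the six analyses is consistent with dividing the sum of the individual z values by the square root of the number of analyses, the approach often called the Stouffer method. The text does not name its procedure, so the sketch below is an assumption, and the function name is hypothetical.

```python
import math


def combined_z(sum_of_z, n_studies):
    """Combine independent standard normal deviates: sum(z) / sqrt(N)."""
    if n_studies <= 0:
        raise ValueError("n_studies must be positive")
    return sum_of_z / math.sqrt(n_studies)


# Figures quoted in the text: the six z values sum to +6.70.
overall = combined_z(6.70, 6)
```

Here `overall` is about 2.735, which agrees with the reported +2.73 to within rounding of the summed z.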

Connors and Horst (1966), in whose research the overall magnitude
of expectancy effect did not reach a directional z of +1.28, nevertheless
found that later contacted subjects showed significantly (z = +1.81)
greater expectancy effects than did earlier contacted subjects. That same
result was obtained by Uno, Frager, and Rosenthal (1968, II), a study
in which the overall magnitude of expectancy effect was tabulated as
a z of .00 although later contacted subjects showed significantly greater
expectancy effects (z = +1.70). In the two other studies by this group
showing no overall expectancy effect (z = .00) there were no order
effects reaching a z of |1.28|. In the two studies by this same group
showing negative expectancy effects (zs = -1.99, -2.17), the first
showed an increase of the negative expectancy effect over time
(z = +1.85) but the second showed a decrease (z = -1.46).

Altogether, then, there are 12 studies investigating the tendency for
expectancy effects to increase as more subjects are seen. Six of the results
support the hypothesis at z > +1.28, one of the results runs counter
When the analysis was based only on the condition in which visual
cues were available, the directional z for expectancy effect was +1.95.
Both the studies described suggest that visual cues may also be
important for the mediation of expectancy effects, though the experiment by
Fode (Rosenthal and Fode, 1963b, II) found mute but visible
experimenters to exert no expectancy effects. Further indirect evidence for
the importance of visual cues comes from the experiment by Woolsey
and Rosenthal (1966). In the first stage of that experiment, subjects
had no visual access to their experimenters, but in the second stage
they did. When the screens were removed from between experimenters
and subjects, expectancy effects became significantly greater
(z = +2.04). This evidence must be held very lightly, however, since
experimenters contacting subjects with visual contact differed in several
other ways from experimenters contacting subjects without visual
contact. One difference was that experimenters with visual contact had
gained greater experience, and more experienced experimenters appear
to show greater expectancy effects, a topic to which we now turn.

D. Expectancy Effects as Interpersonal Learning

For a number of experiments on expectancy effects, sound motion
pictures were available that had been obtained without the experimenters’
or subjects’ prior knowledge. The analyses of some of these films have
been reported elsewhere (Friedman, 1967; Friedman, Kurland and
Rosenthal, 1965; Rosenthal, 1966; Rosenthal, 1967; Rosenthal, Friedman and
Kurland, 1966). For all the hundreds of hours of careful observation,
and for all the valuable things learned about experimenter-subject
interaction, no well specified system of unintentional cueing has been
uncovered. But if the students of experimenter behavior do not know how
experimenters unintentionally cue their subjects to give the expected
response, then how do experimenters themselves know how? Perhaps
they do not know, but perhaps within the context of the given experiment
they can come to know. Expectancy effects may be a learned
phenomenon, and learned in interaction with a series of research subjects. Each
experimenter may have some types of unintended signaling in common
with other experimenters, but beyond that each experimenter may have
some unique unintended signals that work only for him. Whether this
is so is a problem for the psycholinguist, the paralinguist, the kinesicist,
and the sociolinguist. But if there were this unique component to the
unintentional cueing behavior of the experimenter, it might account for
our difficulty in trying to isolate very specific but very widespread cueing
systems.

The experimenter, who very likely knows no more than we about
in this chapter. Other sense modalities will also bear investigation,
however.

For example, Geldard (1960) has brought into focus the role of the
skin senses in human communication and has presented evidence that
the skin may be sensitive to human speech. Even when the sense modality
involved is the auditory, it need not be only speech and speech-related
stimuli to which the ear is sensitive. Kellogg (1962), and Rice
and Feinstein (1965) have shown that, at least among blind humans,
audition can provide a surprising amount of information about the
environment. Employing a technique of echo ranging, Kellogg’s subjects
were able to assess accurately the distance, size, and composition of
various external objects. The implications for interpersonal communication
of these senses and of olfaction, or of even less commonly discussed
modalities (e.g., Ravitz, 1950, 1952), are not yet clear but are worthy
of more intensive investigation.

Since expectancies of another person’s behavior seem often to be
communicated to that person unintentionally, the basic experimental paradigm
employed in our research program might be employed even if the
interest were not in expectancy effects per se. Thus if we were interested
in unintentional communication among different groups of psychiatric
patients, some could be given expectancies for others’ behavior.
Effectiveness of unintentional influence could then be measured by the degree
to which other patients were influenced by expectancies held of their
behavior. There might be therapeutic as well as theoretical significance
to knowing what kind of psychiatric patients were most successful in
the unintentional influence of other psychiatric patients. The experiment
by Persinger, Knutson, and Rosenthal (1966, described in Rosenthal,
1966) employed such a paradigm.

Twelve experimenters administered a standard photo rating task to
94 neuropsychiatric patients who could be classified as either relatively
more anxious than hostile (schizophrenic or neurotic) or as relatively
more hostile than anxious (paranoid or character disorder). Each
experimenter was led to expect half his subjects to judge the stimulus photos
as being of more successful people while the remaining subjects were
expected to judge the photos as being of less successful people. What
made this experiment unusual was that the experimenters were themselves
patients in a mental hospital who had been classified into th[...]
[...]ate student experimenters, mental patient experimenters obtained
responses from their mental patient subjects consistent with their
expectations (z = +1.88). Our primary interest in this study,
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to the hypothesis at z < — 1 28, and five of the results neither support nor 
run counter to the hypothesis. The overall directional z in support of the
hypothesis is +3.06, p = .0011. The five studies by Uno's group were
conducted in Japan, and for just that set of studies the combined z is less
than +1.28. The remaining seven studies were conducted in the United
States and for them the combined z was +3.21. Whether this difference
may be due to differences in communication patterns between the two
cultures is currently under investigation. For the time being, at least,
it seems reasonable to believe that when there is a difference in magnitude
of expectancy effect from earlier to later contacted subjects, it is
among the later seen subjects that expectancy effects are likely to be
larger. The hypothesis that the mediation of expectancy effects is learned
by experimenters in the interpersonal context of the experiment seems
worthy of further investigation.
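
The bookkeeping behind statements of this kind can be sketched in a few lines. The sketch below assumes the chapter's cutoff of 1.28 (a one-tail p of about .10) for sorting a directional z into "supports," "counter," or "neither," and converts a z into its one-tail p. The function names and the example z values are illustrative assumptions, not the values of the studies discussed.

```python
import math

CUTOFF = 1.28  # |z| cutoff used in this chapter; one-tail p of about .10


def one_tailed_p(z: float) -> float:
    """Upper-tail probability of a standard normal deviate."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def classify(z: float) -> str:
    """Sort a directional z into the three outcome classes used in the text."""
    if z >= CUTOFF:
        return "supports"
    if z <= -CUTOFF:
        return "counter"
    return "neither"


# Hypothetical z values, for illustration only (not the studies' actual results).
example_zs = [2.10, 0.35, -1.40, 1.28, -0.90]
for z in example_zs:
    print(f"z = {z:+.2f}: {classify(z)}")

# The overall directional z reported in the text.
print(f"one-tail p for z = +3.06: {one_tailed_p(3.06):.4f}")
```

Under these assumptions the last line agrees with the p of .0011 reported for the overall z of +3.06.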


VIII. RESEARCH ON UNINTENDED INFLUENCE 

Quite apart from the methodological implications of research on
experimenter expectancy effects there are substantive implications for the
study of interpersonal relationships. Perhaps the most general implication
is that people can engage in effective unprogrammed and unintended
communication with one another and that this process of unintentional
influence can be investigated experimentally.

A great deal of effort within the behavioral sciences has gone into
the study of such intentional influence processes as education, persuasion,
coercion, propaganda, and psychotherapy. In each of these cases
the influencer intends to influence the recipient of his message and the
message is usually encoded linguistically. Without diminishing efforts
to understand these processes better, greater effort should perhaps be
expended to understand the processes of unintentional influence in which
the message is often encoded nonlinguistically. The question, in short,
is how people "talk" to one another holding constant what it is they
say.

At the present time not only do we not know the specific signals by
which people unintentionally influence one another, we do not even know
all the channels of communication involved. There is reason, though,
to be optimistic. There appears to be a great current increase of interest
in nonlinguistic behavior as it may have relevance for human communication
(e.g., Sebeok, Hayes, and Bateson, 1964). Most interest seems
to have been centered in the auditory and visual channels of communication
and those are the channels investigated in the research described
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found that B type expenmenters showed greater effects of their expecta- 
tions than did A type experimenters (z = +2.72). Although the literature
of the AB variable is addressed more to mental patients than to college
students, we need only assume that college students are not so disturbed
as schizophrenic patients to have Jenkins' finding lend some support
to the proposition that B type influencers are more effective with less
disturbed influencees.

In his experiment, Trattner employed psychiatric aides as experimenters
and hospitalized schizophrenics as his subjects. Following the
standard procedure, some subjects were represented to their experimenters
as success perceivers, others as failure perceivers. When A type
experimenters contacted more chronically disturbed (process type) patients,
experimenters showed greater effects of their expectations than
when they contacted less chronically disturbed (reactive type) patients.
Similarly consistent with what we might expect on the basis of the
AB literature, the B type experimenters were more successful unintentional
influencers when they contacted the less chronically disturbed
patients than when they contacted the more chronically disturbed
patients (interaction z = +1.91).

In the study by Persinger, et al., a variety of mental patients served
as subjects while male and female ward personnel served as experimenters.
Once again effectiveness of unintended communication was
defined by the degree to which experimenters obtained the responses
they had been led to expect. In this experiment patients were not
selected on the basis of severity of disturbance but rather on the basis
of primary categorization as relatively more anxious than hostile
(schizophrenic and neurotic) or as relatively more hostile than anxious
(paranoid and character disorder). Results showed that greater expectancy
effects were exerted by A type male experimenters and by B type female
experimenters when patients were categorized as relatively more anxious.
When patients were categorized as relatively more hostile, it was the B
type male experimenters and the A type female experimenters who
showed the greater unintended effects of their expectations (interaction
z = 2.06).

The results of these studies lend support to the idea that the AB
variable may be important in the prediction of interpersonal influence,
but that is not the reason for their having been reported here. Rather
the major purpose has been to illustrate the potential utility of studies
of unintended interpersonal influence of the interpersonal expectancy [...]

It seems to be an uncomplicated procedure to induce in one member
of a dyad (A) an expectancy for the behavior of the other member
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to examine the magnitude of unintended influence or communication 
as a joint function of the experimenters' and subjects' nosologies. Results
showed that when both experimenters and subjects could be characterized
as more anxious than hostile, experimenters showed the greatest
positive unintended influence. However, when both experimenters and
subjects were characterized more by hostility than anxiety, the predicted
unintended communication was least effective.

Findings of this kind may have implications for the treatment of
psychiatric disorders. The belief is increasing that an important source of
informal treatment is the association with other patients. If, as seems
likely, such treatment is more unintentional than intentional, then the
grouping of patients might be arranged so that patients are put into
contact with those other patients with whom they can "talk" best, even
if this "talk" be nonlinguistic.

Perhaps success as an unintentional influencer of another's behavior
also has relevance for the selection of psychotherapists to work with
certain types of patients. The general strategy of trying to "fit the
therapist to the patient" has been considered and has aroused considerable
interest (e.g., Betz, 1962). That such selection may be made on the
basis of unintentional communication patterns may also be suggested.
In one recent study, it was found that the degree of hostility in the
doctor's speech was unrelated to his success in getting alcoholic patients
to accept treatment. However, when the content of the doctor's speech
was filtered out, the degree of hostility found in the tone of his voice
alone was significantly and negatively related to his success in influencing
alcoholics to seek treatment (Milmoe, Rosenthal, Blane, Chafetz,
and Wolf, 1965; see also Milmoe, Novey, Kagan, and Rosenthal, 1968).
One variable in particular, the "AB" variable, has been employed
in a promising series of studies relevant to patient-therapist pairing
(Betz, 1962; Berzins and Seidman, 1968; Carson, 1967). There are
indications that so-called "A" type therapists (as defined by a paper and pencil
test) are more effective with more disturbed patients while "B" type
therapists are more effective with less disturbed patients. With these
indications in mind we conducted a series of studies in which A and
B type experimenters administered the standard photo rating task to
subjects under different conditions of expectation. The general prediction
was that the differential effectiveness of unintended communication by
A and B type experimenters vis-à-vis their subjects would parallel the
differential therapeutic effectiveness of A and B type therapists vis-à-vis
their patients. Three such studies were conducted (Jenkins, 1966;
Persinger, Knutson, and Rosenthal, 1968; Trattner, 1968, 1968).

For her sample of college student experimenters and subjects, Jenkins
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want to regard it as a phenomenon of general interest and one not 
restricted in its implications to the data-collecting work of the behavioral
scientist, we must see whether interpersonal expectancies can also be
made to show themselves in other interpersonal contexts. The context
selected was that of ongoing educational systems (Rosenthal and Jacobson,
1968).

All of the children in an elementary school serving a lower socio-economic
status neighborhood were administered a non-verbal test of intelligence.
The test was disguised as one that would predict intellectual
"blooming." There were 18 classrooms in the school, three at each of
the six grade levels. Within each grade level the three classrooms were
composed of children with above average ability, average ability, and
below average ability, respectively. Within each of the 18 classrooms
approximately 20 per cent of the children were chosen at random to
form the experimental group. Each teacher was given the names of the
children from her class who were in the experimental condition. The
teacher was told that these children had scored on the "test for
intellectual blooming" such that they would [...] intellectual
competence during the next eight months of school. The only difference
between the experimental group and the control group children
was in the minds of the teachers.
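
The assignment step described above (about 20 per cent of each classroom designated at random) can be sketched as follows. This is a minimal illustration, not the procedure actually used in the school: the class size of 30, the pupil labels, the seed, and the function name `designate_bloomers` are all assumptions.

```python
import random


def designate_bloomers(roster, fraction=0.20, rng=None):
    """Randomly pick about `fraction` of a classroom roster for the experimental group."""
    rng = rng or random.Random()
    k = round(len(roster) * fraction)
    return set(rng.sample(roster, k))


# Hypothetical school: 6 grades x 3 ability tracks = 18 classrooms.
# The class size of 30 and the pupil labels are assumptions for illustration.
rng = random.Random(1968)  # fixed seed so the sketch is repeatable
tracks = ["above average", "average", "below average"]
classrooms = {
    (grade, track): [f"g{grade}-{track[:2]}-{i:02d}" for i in range(30)]
    for grade in range(1, 7)
    for track in tracks
}

# Designation is done separately within each classroom, so every grade and
# every ability track contributes children to the experimental group.
experimental = {
    room: designate_bloomers(pupils, rng=rng) for room, pupils in classrooms.items()
}

for room, chosen in list(experimental.items())[:3]:
    print(room, len(chosen), "of", len(classrooms[room]), "designated")
```

Because the designation is random within each classroom, any later difference between designated and undesignated children can be attributed to what the teacher was told rather than to the children themselves.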

Eight months later, at the end of the school year, [...]
were retested with the same IQ test. This intelligence [...]
relatively nonverbal in the sense of requiring no speaking, [...]
was not entirely nonverbal [...]
but more ability to reason abstractly. For shorthand [...]
to the former as a "verbal" subtest and to the [...]
subtest. The pretest correlation between [...]

For the school as a whole, the children of the [...]
showed only a slightly greater gain in [...]
the control group children. However, in total IQ [...] and
especially in reasoning IQ (7 points), the experimental group children gained
appreciably more than did the control group children.

[...] educational theorists [...] effects of
teacher expectations, they have usually referred to the children [...]
levels of scholastic achievement. It [...] therefore, to find
[...] achievement [...]
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(B) On the basis of the experiments summarized m this chapter, the 
odds are not unfavorable that the expectancy will be communicated
to the other member of the dyad. With that a likely occurrence, the
student of processes of covert communication has a focus for his attention.
He will be looking for what A does differently in the interaction as
a function of what is expected from B.
Depending on the tastes and questions of the investigator, either an
experimental or an observational approach can be employed. If the
investigator were interested, for example, in finding out the proportion of
information carried in various channels of communication, the openness
of these channels could be systematically varied. If the investigator
were more interested in a global description of communication processes,
he might permit all channels to remain fully open while trying to give
as complete a description as possible of the type and amount of information
carried in each channel. This can be partially accomplished
by having different observers focus on channels that have been artificially
isolated from one another. In the case of sound motion pictures, for
example, some observers can be given access only to the silent film
while others are given access only to the sound track. Potentially as
instructive as the analysis of individual channels of communication may
be the analysis of differences between the signals sent through different
channels.


IX. BEYOND THE LABORATORY 

The vast majority of the experiments summarized in this chapter were
conducted in psychological laboratories. Even when the subjects were
not sophomores but psychiatric patients, the interaction between
experimenter and subject took place in a setting that unquestionably spelled
"laboratory." For reasons quite apparent to any reader of this volume
on artifacts in behavioral research, there are considerable advantages
to testing laboratory-derived relationships in nonlaboratory settings. The
laboratory gives us convenience, the control of variance-increasing
variables, and perhaps a sense of security. The world beyond the laboratory
gives us inconvenience, frequent increases in error variance, and a feeling
of insecurity when mostly we do our work in the basement of the
Psychology Building. But if a relationship obtained in the laboratory is
to be viewed as uncontaminated by the procedures, subjects, and setting
of the laboratory itself, it must be taken out of the artificial light of
the lab and examined in the harsher light of the world beyond.
In the case of the variable of interpersonal expectancy, if we should
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even one whose IQ is rising, to be seen by his teacher as a well adjusted 
child and as a potentially successful child, intellectually.
The effects of teacher expectations had been most dramatic when
measured in terms of pupils' gains in reasoning IQ. These effects on
reasoning IQ, however, were not uniform for boys and girls. Although
all the children of this lower socio-economic status school gained
dramatically in IQ, it was only among the girls that greater gains were
shown by those who were expected to bloom compared to the children
of the control group. Among the boys, those who were expected to
bloom gained less than did the children of the control group
(interaction F = 9.27, p = .003).

In part to check this finding, the experiment originally conducted
on the West Coast was repeated in a small Midwestern town (Rosenthal
and Evans, 1968). This time the children were from substantial middle
class backgrounds, and this time the results were completely and
significantly reversed. Now it was the boys who showed the benefits of
favorable teacher expectations. Among the girls, those who were expected
to bloom intellectually gained less in reasoning IQ than did the girls
of the control group (interaction F = 9.10, p = .003). Just as in the
West Coast experiment, however, all the children showed substantial
gains in IQ. These results, while they suggest the potentially powerful
effects of teacher expectations, also indicate the probable complexity
of these effects as a function of pupils' sex, social class, and, as time
will no doubt show, other variables as well.

In both the experiments described, IQ gains were assessed after a
full academic year had elapsed. However, the results of another experiment
suggest that teacher expectations can significantly affect
intellectual performance in a period as short as two months (Anderson
and Rosenthal, 1968). In this small experiment, the [...] children were
mentally retarded boys with an average pretest IQ of 46. Expectancy
effects were significant only for reasoning IQ and only in interaction
with membership in a group receiving special remedial reading instruction
in addition to participating in the school's summer day camp program
(p < .03). Among these specially tutored boys those who were
expected to bloom showed an expectancy disadvantage of nearly [...]
IQ points; among the untutored boys who were participating only in
the school's summer day camp program, those who were expected to
bloom showed an expectancy advantage of just over [...].
(For verbal IQ, in contrast, the expectancy disadvantage of the tutored
boys was less than one IQ point, while the expectancy advantage for
the untutored boys was over two points.)

The results described were based on posttesting only two months
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to descnbe the classroom behavior of their pupils Those children from 
whom intellectual growth was expected were described as having a
significantly better chance of becoming successful in the future, as
significantly more interesting, curious, and happy. There was a tendency, too,
for these children to be seen as more appealing, adjusted, and affectionate
and as lower in the need for social approval. In short, the children
from whom intellectual growth was expected became more intellectually
alive and autonomous, or at least they were so perceived by their
teachers.

We have already seen that the children of the experimental group
gained more intellectually so that perhaps it was the fact of such gaining
that accounted for the more favorable ratings of these children's behavior
and aptitude. But a great many of the control group children also gained
in IQ during the course of the year. We might expect that those who
gained more intellectually among these undesignated children would
also be rated more favorably by their teachers. Such was not the case.
The more the control group children gained in IQ the more they were
regarded as less well adjusted, as less interesting, and as less affectionate.
From these results it would seem that when children who are expected
to grow intellectually do so, they are considerably benefited in other
ways as well. When children who are not especially expected to develop
intellectually do so, they seem either to show accompanying undesirable
behavior or at least are perceived by their teachers as showing such
undesirable behavior. If a child is to show intellectual gain it seems
to be better for his real or perceived intellectual vitality and for his
real or perceived mental health if his teacher has been expecting him
to grow intellectually. It appears worthwhile to investigate further the
proposition that there may be hazards to unpredicted intellectual growth.
A closer analysis of these data, broken down by whether the children
were in the high, medium, or low ability tracks or groups, showed that
these effects of unpredicted intellectual growth were due primarily to
the children of the low ability group. When these slow track children
were in the control group so that no intellectual gains were expected
of them, they were rated more unfavorably by their teachers if they
did show gains in IQ. The greater their IQ gains, the more unfavorably
were they rated, both as to mental health and as to intellectual vitality.
Even when the slow track children were in the experimental group
(so that IQ gains were expected of them) they were not rated as favorably
relative to their control group peers as were the children of the
high or medium track, despite the fact that they gained as much in
IQ relative to the control group children as did the experimental group
children of the high group. It may be difficult for a slow track child,
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cues may be important in the covert communication of interpersonal 
expectations.

In all the experiments described so far, the same IQ measure was
employed, the Flanagan (1960) Tests of General Ability. Also employing
the same instrument with his sample of first graders, Claiborn (1968)
found a tendency (z = -1.45) for children he designated as potential
bloomers to gain less in IQ than the children of the control group.
With fifth grade boys as his subjects and males as teachers, Pitt (1956)
found no effect on achievement scores of arbitrarily adding or subtracting
ten IQ points to the children's records. In her study, Heiserman
(1967) found no effect of teacher expectations on her 7th graders' stated
levels of occupational aspiration.

There have been two studies in which teachers' expectations were
varied not for specific children within a classroom but rather for
classrooms as a whole (Biegen, 1968; Flowers, 1966). In both cases the
performance gains were greater for those classrooms expected by their
teachers to show the better performance.
A radically different type of performance measure was employed in
the research by Burnham (1968), not intelligence or scholastic achievement
this time, but swimming ability. His subjects were boys and girls
aged 7-14 attending a summer camp for the disadvantaged. None of the
children could swim at the beginning of the two week experimental
period. Half the children were alleged by the camp staff to have shown
unusual potential for learning to swim as judged from a battery of
psychological tests. Children were, of course, assigned to the "high
potential" group at random. At the end of the two week period of the
experiment all the children were retested on the standard Red Cross
Beginner Swimmer Test. Those children who had been expected to show
greater improvement in swimming ability showed greater improvement
than did the children of the control group.

We may conclude now with the brief description of just one more
experiment, this one conducted by Beez (1968), who kindly made his
data available for the analyses to follow. This time the pupils were
60 preschoolers from a summer Headstart program. Each child was
taught the meaning of a series of symbols by one teacher. Half the
60 teachers had been led to expect good symbol learning and half had
been led to expect poor symbol learning. Most (77 per cent) of the
children alleged to have better intellectual prospects learned five or more
symbols, but only 13 per cent of the children alleged to have poorer
intellectual prospects learned five or more symbols (p < 2 in one
million). In this study the children's actual performance was assessed by
an experimenter who did not know what the child's teacher had been
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after the initiation of the experiment Follow up testing was undertaken 
seven months after the end of the basic experiment. In reasoning IQ,
the boys who had been both tutored and expected to bloom intellectually
made up the expectancy disadvantage they had shown after just two
months. Now their performance change was just like that of the control
group children, both groups showing an IQ loss of four points over
the nine month period. Compared to these boys who had been given
both or neither of the two experimental treatments, the boys who had
been given either tutoring or the benefit of favorable expectations
showed significantly greater gains in reasoning IQ scores (p < .025).
Relative to the control group children those who were tutored showed
a 10 point advantage while those who were expected to bloom showed
a 12 point advantage. While both tutoring and a favorable teacher
expectation were effective in raising relative IQ scores, it appeared that when
these two treatments were applied simultaneously, they were ineffective
in producing IQ gains over the period from the beginning of the
experiment to the nine month follow-up. One possible explanation of this
finding is that the simultaneous presence of both treatments led the
boys to perceive too much pressure. The same pattern of results reported
for reasoning IQ was also obtained when verbal IQ and total IQ were
considered, though the interaction was significant only in the case of
total IQ (p < .03).
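
The pattern just described is a crossover interaction in a 2 x 2 (tutoring by expectancy) design, and the arithmetic can be laid out explicitly. The sketch below uses only the figures reported above; the absolute changes for the two single-treatment groups are derived on the assumption that the 10 and 12 point advantages are measured against the control group's four point loss. The variable names are illustrative.

```python
# Change in reasoning IQ over the nine months, by treatment cell.
control = -4.0                 # neither treatment (reported loss of four points)
both = -4.0                    # tutored and expected to bloom (same reported loss)
tutored_only = control + 10    # derived from the reported 10 point advantage
expected_only = control + 12   # derived from the reported 12 point advantage

# Simple effect of a favorable expectation at each level of tutoring.
expectancy_effect_untutored = expected_only - control
expectancy_effect_tutored = both - tutored_only

# Interaction contrast: how much the expectancy effect changes with tutoring.
interaction = expectancy_effect_tutored - expectancy_effect_untutored

print("expectancy effect, untutored boys:", expectancy_effect_untutored)
print("expectancy effect, tutored boys:  ", expectancy_effect_tutored)
print("interaction contrast:             ", interaction)
```

The two simple effects have opposite signs, which is what it means to say that each treatment helped alone but not in combination.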

In the experiment under discussion a number of other measures of
the boys' behavior were available as were observations of the day camp
counselors' behavior toward the boys. Preliminary analysis suggests that
boys who had been expected to bloom intellectually were given less
attention (p = .09) by the counselors and developed a greater degree
of independence (p < .02) compared to the boys of the control group.

Another study, this time conducted in an East Coast school with upper
middle class pupils, again showed the largest effect of teachers'
expectancies to occur when the measure was of reasoning IQ (Conn, Edwards,
Rosenthal, and Crowne, 1968). In this study, both the boys and the
girls who were expected to bloom intellectually showed greater gains
in reasoning IQ than did the boys and girls of the control group, and
the magnitude of the expectancy effect favored the girls very slightly.
Also in this study, we had available a measure of the children's accuracy
in judging the vocal expressions of emotion of adult speakers. It was
of considerable theoretical interest to find that greater benefits of
favorable teacher expectations accrued to those children who were more
accurate in judging the emotional tone expressed in an adult female's
voice. These findings, taken together with the research of Adair and
Epstein (1967) described earlier, give a strong suggestion that vocal
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TABLE XXIII 

Expectancy Effects in Educational Settings

                                          Directional standard
      Study                               normal deviate        Dependent variable

  1   Anderson and Rosenthal    1968         .00 (a)            Total IQ
  2   Beez                      1968       +4.67                Symbol learning
  3   Biegen                    1968       +1.83                Achievement
  4   Burnham (b)               1968       +2.61 (a)            Swimming skill
  5   Claiborn                  1968       -1.45 (a)            Total IQ
  6   Conn, et al.              1968         .00 (a)            Total IQ
  7   Flowers                   1966       +1.60                Achievement + IQ
  8   Heiserman                 1967         .00                Aspiration
  9   Pitt                      1956         .00                Achievement
 10   Rosenthal and Evans       1968         .00 (a)            Total IQ
 11   Rosenthal and Jacobson    1968       +2.11 (a)            Total IQ

      Sum                                 +11.37
      Square root of 11                     3.32
      z                                    +3.42
      p                                     .00033

(a) Indicates that teacher expectancy interacted with another variable at z >= |1.28|.
(b) See also Burnham and Hartsough (1968).


effect of teacher expectation, it should be noted that three of them
showed significant interactions of teacher expectation with some other
primary variable such as special tutoring (study 1), accuracy of emotion
perception (6), and sex of pupil (10). The combined one-tail p of
the main effects of teacher expectancy in the studies shown in Table
XXIII is less than 1 in 3,000. It would take an additional 37 studies
of a mean associated z value of .00 to bring the overall combined p
to above .05.®
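
The summary rows of Table XXIII and the figure of 37 additional studies can be checked with a short computation. The sketch below assumes the combined z is obtained by dividing the sum of the study z values by the square root of the number of studies (as the table's "Sum" and square-root rows suggest), and that "above .05" refers to a one-tail criterion of z = 1.645. The function and variable names are illustrative.

```python
import math

# Directional z for each study as tabulated in Table XXIII
# (values between -1.28 and +1.28 were recorded as zero).
table_xxiii_z = {
    "Anderson and Rosenthal": 0.00,
    "Beez": 4.67,
    "Biegen": 1.83,
    "Burnham": 2.61,
    "Claiborn": -1.45,
    "Conn, et al.": 0.00,
    "Flowers": 1.60,
    "Heiserman": 0.00,
    "Pitt": 0.00,
    "Rosenthal and Evans": 0.00,
    "Rosenthal and Jacobson": 2.11,
}


def one_tailed_p(z: float) -> float:
    """Upper-tail probability of a standard normal deviate."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def additional_null_studies(sum_z: float, k: int, z_crit: float = 1.645) -> int:
    """Smallest number of extra studies averaging z = 0 that pulls the
    combined z, sum_z / sqrt(k + extra), below z_crit."""
    return math.ceil((sum_z / z_crit) ** 2 - k)


k = len(table_xxiii_z)
sum_z = sum(table_xxiii_z.values())
combined_z = sum_z / math.sqrt(k)

print(f"sum of z        = {sum_z:+.2f}")
print(f"sqrt(k)         = {math.sqrt(k):.2f}")
print(f"combined z      = {combined_z:+.3f}")
print(f"one-tail p      = {one_tailed_p(combined_z):.5f}")
print(f"null studies    = {additional_null_studies(sum_z, k)}")
```

Under these assumptions the sum, the square root, and the count of additional studies agree with the text; the combined z and p may differ from the tabled values in the last digit because of rounding.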

Shall we view this set of experiments in natural learning situations
in isolation or would it be wiser to see them simply as more of the
same type of experiment that has been discussed throughout this chap-

• Combining the ps of the 103 studies of Table XXIV together with the results
of the nine studies of footnote 2 gives a grand sum z of +112.23. The overall
z associated with this set of results is +10.51 and 4 5J0 new experiments with
a mean z of .00 are required to bring the overall p level to .05. (As this chapter
went to press, the results of another study of teacher expectation effects became
available. Meichenbaum, Bowers, and Ross at the University of Waterloo found
that favorable teacher expectations led to a significant increase in the appropriateness
of classroom behavior of a sample of adolescent female offenders (iff ^ 12
= - +2 02 )
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told about the child’s intellectual prospects Teachers who had been 
given favorable expectations about their pupil tried to teach more
symbols to their pupil than did the teachers given unfavorable expectations
about their pupil. The difference in teaching effort was dramatic. Eight
or more symbols were taught by 87 per cent of the teachers expecting
better performance, but only 13 per cent of the teachers expecting poorer
performance tried to teach that many symbols to their pupil (p < 1
in 10 million).
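
To give a feel for how large these differences are, the percentages can be put through a pooled two-proportion z test. This is only a sketch and not necessarily the analysis applied to Beez' data: it assumes 30 teacher-pupil pairs in each expectancy condition (half of 60) and counts of 23 versus 4 and 26 versus 4, which are the whole numbers that round to the reported 77, 87, and 13 per cent.

```python
import math


def one_tailed_p(z: float) -> float:
    """Upper-tail probability of a standard normal deviate."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Pooled two-proportion z statistic for x1/n1 versus x2/n2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


N_PER_CONDITION = 30  # assumption: the 60 pairs were split evenly

# Assumed counts reconstructed from the reported percentages.
z_learning = two_proportion_z(23, N_PER_CONDITION, 4, N_PER_CONDITION)  # 77% vs 13%
z_teaching = two_proportion_z(26, N_PER_CONDITION, 4, N_PER_CONDITION)  # 87% vs 13%

print(f"children learning 5+ symbols: z = {z_learning:.2f}, "
      f"one-tail p = {one_tailed_p(z_learning):.1e}")
print(f"teachers teaching 8+ symbols: z = {z_teaching:.2f}, "
      f"one-tail p = {one_tailed_p(z_teaching):.1e}")
```

With these assumed counts both probabilities fall below the bounds quoted in the text, although the z for symbol learning need not match the value tabulated for this study, which may rest on a different test.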

These results suggest that a teacher's expectation about a pupil's
performance may sometimes be translated not into subtle vocal nuances
but rather into overt and even dramatic alterations in teaching style.
The magnitude of the effect of teacher expectations found by Beez is also
worthy of comment. In all the earlier studies described, one group of
children had been singled out for favorable expectations while nothing
was said of the remaining children of the control group. In Beez' short-term
experiment it seemed more justified to give negative as well as
positive expectations about some of the children. Perhaps the very large
effects of teacher expectancy obtained by Beez were due to the creation
of strong equal but opposite expectations in the minds of the different
teachers. Since strong negative expectations doubtless exist in the real
world of classrooms, Beez' procedure may give the better estimate of
the effects of teacher expectations as they occur in everyday life.

In the experiment by Beez it seems clear that the dramatic differences
in teaching style accounted at least in part for the dramatic differences
in pupil learning. However, not all of the obtained difference in learners'
learning was due to the differences in teachers' teaching. Within each
condition of teacher expectation, for example, there was no relationship
between number of symbols taught and number of symbols learned.
In addition, it was also possible to compare the performances of just
those children of the two conditions who had been given an exactly
equal amount of teaching benefit. Even holding teaching benefits constant,
the difference favored the children believed to be superior
(t = 2.89, p < .005, one tail) though the magnitude of the effect was
now diminished by nearly half.

We have now seen at least a brief description of 11 studies of the
effects of interpersonal expectancies in natural learning situations. That
is too many to hold easily in mind and Table XXIII provides a convenient
summary. For each experiment the directional standard normal
deviate is given as well as a brief identification of the dependent
variables employed. As has been the custom in this chapter, a standard
normal deviate greater than -1.28 and smaller than +1.28 has been
recorded as zero. Of the five experiments tabulated as showing no main
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TABLE XXV 

Proportions of Experimenters and Teachers
Showing Expectancy Effects

                                        Experimenters    Teachers

Number of Studies                            57              6
Median z                                    .00            .00
Number of Es or Ts                          523            115
Mean N per Study                              9             19
Weighted Percent of Biased Es or Ts         69%            75%
Median Percent of Biased Es or Ts           7;.%           66%


can be expected to show the effects of their expectation on the
performance of their subjects or pupils.

This chapter began its discussion of interpersonal expectancy effects
by suggesting that the expectancy of the behavioral researcher might
function as a self-fulfilling prophecy. This unintended effect of the
investigator's research hypothesis must be regarded as a potentially damaging
artifact. But interpersonal self-fulfilling prophecies do not operate only
in laboratories and while, when there, they may act as artifacts, they
are more than that. Interpersonal expectancy effects occur also among
teachers and, there seems no reason to doubt it, among others as well.
What started life as an artifact continues as an interpersonal variable
of theoretical and practical interest. Today's artifact, as Bill McGuire
so wisely said, is tomorrow's main effect, and tomorrow is today.
ter. Since the type of experimental manipulation involved in the 
laboratory studies is essentially the same as that employed in the studies 
beyond the laboratory, it seems more parsimonious to view all the studies 
as members of the same set. If, in addition to the communality of 
experimental procedures, we find it plausible to conclude a 
communality of outcome patterns between the laboratory and field 
experiments, perhaps we can have the greater convenience and power of 
speaking of just one type of effect of interpersonal expectancy. Table XXIV
ment is very good. Depending upon the particular method of 
computation selected, about 7 out of 10 experimenters or 7 out of 10 teachers

TABLE XXIV

Percentage of Studies of Expectancy Effects in
Laboratories and Educational Settings
Obtaining Results at Specified p Levels

p               Laboratories    Educational settings
                                      (N = 11)

.10
.05                 35%                 36%
.01                 17%                 18%
.001                12%                  9%
.0001                5%                  9%
.00001               3%                  9%
.000001              2%                  0%

Grand Sum z        +95.27              +11.37
Mean z              +1.01               +1.03
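
The two summary rows of Table XXIV are related by simple arithmetic: the mean z is the grand sum of the recorded z values divided by the number of studies. The following is only a minimal sketch of that bookkeeping, assuming the chapter's stated convention that a z lying between -1.28 and +1.28 is recorded as zero. The function names and the example z values are hypothetical and do not come from the tabulated studies.

```python
def recorded_z(z, cutoff=1.28):
    """Apply the chapter's convention: a z strictly between -cutoff and
    +cutoff is recorded as zero; any other value is kept unchanged."""
    return 0.0 if -cutoff < z < cutoff else z


def mean_z(z_values):
    """Mean of the recorded z values: (sum of recorded z) / (number of studies)."""
    recorded = [recorded_z(z) for z in z_values]
    return sum(recorded) / len(recorded)


# Hypothetical z values for a handful of studies, invented for illustration.
example = [2.89, 0.40, -0.75, 1.64, 0.00]
print(round(mean_z(example), 2))

# Consistency check on the educational-settings column of Table XXIV:
# grand sum z of 11.37 over 11 studies.
print(round(11.37 / 11, 2))
```

Under this reading, the educational-settings entries agree with one another: 11.37 divided by 11 studies rounds to the tabled mean of 1.03.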
THE CONDITIONS AND CONSEQUENCES 
OF EVALUATION APPREHENSION 


Milton J. Rosenberg 
University of Chicago 


Just as it keeps rats pressing levers, intermittent reinforcement keeps 
psychologists theorizing and neologizing. The best reinforcer I know 
is not the student’s imitation of his professor’s crotchets, nor is it a 
“successful replication” of one’s experiment by another; instead it is 
to have some theoretical term that one has coined be often quoted 
and then to watch the quotation marks fade away as the term begins 
to enjoy some common usage. To put a phrase into the language (even 
if that language is spoken by only a few dozen others) confirms the 
sometimes faltering sense that one has really said something.

This seems to have begun to happen with the term “evaluation 
apprehension,” which I first used in some unpublished documents in 1960-61 
and in an obscure article in 1963, and which I then explicated in a 
more visible one in 1965. Yet the diffusion of the term is not at all 
due to its being the key to some arcane and profound insight. Most 
experimental psychologists had long since come to the unhappy 
awareness that their subjects were prone to “faking it” and, particularly, to 
faking it “good.” But as a sort of contrast to Mark Twain’s aphorism 
about people and the weather, the problem of self-presentation in 
experiments seemed to be something that virtually nobody was talking about,* 
though a great deal could be done about it.

* One clear voice that helped break the silence was that of Henry Riecken. 
In a valuable article published in 1962 he proffered a general view of the psychologi-
That the term “evaluation apprehension” has recently gained some 
currency must, then, be due to its helping to fill a need: the need, 
I should say, for experimental psychologists, social and otherwise, to 
come to terms with an obvious and fascinating source of trouble in 
their experimental procedures and rituals. In recent years my own sense 
of that need has led me beyond the initial conceptualization and into 
this possibly paradoxical commitment to try to do systematic 
experimentation on evaluation apprehension as a source of systematic bias in 
psychological experiments.

This chapter is intended as a rather loose, narrative account of the 
main directions taken and the major findings gleaned in that research 
program. All but the first of the studies to be described are previously 
unpublished, though they have been presented in various colloquia over 
the last two years. Some of these studies will be described in full detail 
in forthcoming articles, and it is my ambition to bring all this work, 
and related studies, into tight but expansive focus in an as yet unwritten 
book.


I EVALUATION APPREHENSION AS CONCEPT AND 
PROCESS 


To begin, I had better not assume that the partial diffusion of the 
term “evaluation apprehension” has also spread abroad its full intended 
conceptual meaning. Thus what is called for, first, is a statement of 
definition. Then I shall need to outline my conception of how evaluation 
apprehension gets aroused and, after arousal, sometimes interacts with 
features of the experimental situation in ways that produce systematic 
biasing of experimental response data. Following these necessary 
preliminaries I shall turn, in the last portion of this introductory section, 
to some of the reasoning that lies behind the basic conceptualization, 
and then we can begin to look at its research implications.


experiment as a sort of ritualized exchange between subject and experimenter. 
An important aspect of the exchange dynamic, as he saw it, was the subject’s 
desire to put his best foot forward. However, in Riecken’s view, this was basically 
a source of unintended variance in data, and the possibility that it could exert 
systematic influence making for false confirmation or disconfirmation of hypotheses 
was not directly examined.

[...] self-presentation process were the inquiries by Edwards [...] 
Marlowe (1964) concerning the “social desirability” [...] 
described in this chapter, their basic interest has [...] 
influence of positive self-presentation upon psychological 
testing and its results rather than upon psychological experiments.
What, then, is the working conception of evaluation apprehension 
around which my recent research and this chapter are organized? The 
summary given in an earlier article (Rosenberg, 1965) is, I think, worth 
repeating here:

“It is proposed that the typical human subject approaches the typical 
psychological experiment with a preliminary expectation that the 
psychologist may undertake to evaluate his (the subject’s) emotional 
adequacy, his mental health or lack of it. Members of the general public, 
including students in introductory psychology courses, have usually 
learned (despite our occasional efforts to persuade them otherwise) 
to attribute special abilities along these lines to those whose work is 
perceived as involving psychological interests and skills. Even when 
the subject is convinced that his adjustment is not being directly studied 
he is likely to think that the experimenter is nevertheless bound to be 
sensitive to any behavior that bespeaks poor adjustment or immaturity. 

“In experiments the subject’s initial suspicion that he may be exposing 
himself to evaluation will usually be confirmed or disconfirmed (as he 
perceives it) in the early stages of his encounter with the experimenter. 
Whenever it is confirmed, or to the extent that it is, the typical subject 
will be likely to experience evaluation apprehension; that is, an active, 
anxiety-toned concern that he win a positive evaluation from the 
experimenter, or at least that he provide no grounds for a negative one. 
Personality variables will have some bearing upon the extent to which this 
pattern of apprehension develops. But equally important are various 
aspects of the experimental design such as the experimenter’s explanatory 
‘pitch,’ the types of measures used, and the experimental manipulations 
themselves. 

“Such factors may operate with equal potency across all cells of an 
experiment, but we shall focus upon the more troublesome situation 
in which treatment differences between experimental groups make for 
differential arousal and confirmation of evaluation apprehension. The 
particular difficulty with this state of affairs is that subjects in groups 
experiencing comparatively high levels of evaluation apprehension will 
be more prone than subjects in other groups to interpret the 
experimenter’s instructions, explanations, and measures for what they may 
convey about the kinds of responses that will be considered healthy 
or unhealthy, mature or immature. In other words, they will develop 
hypotheses about how to win positive evaluation or to avoid negative 
evaluation. And usually the subjects in such an experimental group are 
enough alike in their perceptual reactions to the situation so that there 
will be considerable similarity in the hypotheses at which they separately 
arrive. This similarity may, in turn, operate to systematically influence 
experimental responding in ways that foster false confirmation of the 
experimenter’s predictions.”

What suggests this view of the secret side of the structured transaction 
between experimenter and subject? What, if anything, confirms the view? 

One answer to the first of these questions concerns the modal theme 
that is usually encountered when one engages subjects in extended 
postexperimental discussion. Experienced experimenters who bother to talk 
to their subjects have all heard questions like these: “How did I do? 
Were my responses (answers) normal?” “What were you really trying 
to find out, whether I’m some kind of neurotic?” “Did I react the same 
as most people do?” If one goes further in postexperimental inquiry, 
as I have regularly tried to do in recent years in my experimental work 
on attitude change (see Abelson et al., 1968; Rosenberg et al., 1960), 
and asks subjects to attempt a reconstruction of their private experience 
of the experimental transaction, one often picks up another theme that 
I take to be quite significant. Subjects will report (sometimes with 
uncertainty and sometimes with great clarity) that they were burdened or 
preoccupied with the question “What is the real purpose of this 
experiment?” and that when some striking aspect of the experimental situation 
was revealed to them (whether through further instructions from the 
experimenter or, often, through first encounter with the instrument 
designed to elicit dependent variable measures) this generated a flash 
of insight about what the experimenter was “really trying to find out 
about me.” Though such “insights” are almost always incorrect, they 
are of the sort that is capable of affecting the subject’s further behavior 
in the experimental situation. The fact that such influence upon 
experimental responding has occurred is often the precise burden of the 
subject’s remarks.

Thus, conversations with subjects (and also with graduate students 
and colleagues as they muse upon their memories from undergraduate 
years when they were the recruited subjects rather than the recruiters) 
have helped to shape the basic conception of the evaluation 
apprehension process. Yet another contributing influence has been the fact that 
experienced experimental social psychologists seem to share a certain 
basic style when engaged in professional “yes-butism.” What I mean 
is that when they suspect that someone else’s data “are too neat, and 
the hypothesis can’t be that true,” their first line of reinterpretation is 
usually to suggest that something about the experimental instructions 
or manipulations probably “aroused” the subjects in some unintended 
way or direction. Who has not heard reinterpretations similar to these 
illustrative ones? “The instructions probably made the subjects in the 
experimental group quite anxious about how they would be accepted 
and that, rather than the attributions of expertise as such, would be 
enough to make them conform to the views of the other group 
members.” Or: “By telling the subjects that prejudiced people are people 
who have repressed their hostility toward parents you are really making 
it necessary for them to show that the tolerance message influences 
them; it isn’t ‘insight’ that accounts for the change, it’s their need to 
get the psychologist’s approval.”

The penchant for this sort of reinterpretation in terms of self-presenta- 
tion dilemmas is widespread. Is this simply because it is a normative 
style in our profession; or has it become so because it reflects a persisting 
social psychological reality in the conduct of psychological research? 
Obviously, I would suggest that the latter is the case. 

But if observations and speculations of the sort that I have indicated 
have helped to suggest the evaluation apprehension view, they do not, 
of course, in any way serve to confirm it. Confirmation can only be 
accomplished through further research. Thus, one of the basic aims of 
the experimental program that I and my various colleagues have been 
conducting has been to demonstrate that evaluation apprehension, once 
aroused, can significantly influence dependent variable data. We have 
intended also to show that this influence often works not merely to 
increase “random error variance” but rather that it exerts systematic 
bias upon experimental responding; i.e., it “tilts” data distributions 
toward one or the other end of the response continuum and thus generates 
“significant” findings that happen also to be illusory ones.

We have had other purposes in mind as well, particularly to 
investigate the conditions under which evaluation apprehension is more or
less likely to be aroused and, if aroused, more or less likely to induce 
systematic bias in dependent variable data. 

I shall return to these matters later. Our first task is to review and 
discuss some “demonstration” studies. What they are intended to demon- 
strate is, simply, that when evaluation apprehension is aroused (and 
when it is coupled with the provision of cues that hint how the normal 
or “healthy” person would be likely to respond) this can induce 
systematic bias. Of course, it must be clearly understood that any 
demonstration that this can happen does not establish that it always or usually
will happen. But there is no point in worrying about evaluation appre- 
hension at all or in spending effort on trying to control and reduce 
it, unless we have first satisfied ourselves that it can actually be shown 
to exert biasing influence upon experimental responding. 
monetary reward) this will induce more attitude change in the advocate 
than when counterattitudinal advocacy is undertaken with strong 
justification (e.g., for a comparatively large monetary reward).

However, as many observers (among them Chapanis and Chapanis, 
1964; Brown, 1962) have pointed out, dissonance studies of this type 
confront the subjects (particularly those in “low dissonance” experimental 
groups) with startling and ambiguous experiences and conditions. Agree- 
ing with Chapanis and Chapanis that a likely consequence will be the 
arousal of “suspicion,” I thought it possible to be even more specific 
about the intervening, response-affecting, patterns of arousal that may 
occur with subjects in such studies. The particular case in point upon 
which we focussed was the well-known study by Cohen. In this 
experiment Yale undergraduates had been recruited to write essays in support
of a position opposite to the one they actually held on a currently salient 
campus issue. The issue concerned “the actions of the New Haven police” 
in a recent campus riot. The undergraduates uniformly felt that the 
police had behaved badly. The essay they were requested to write was 
on the topic: “Why the actions of the New Haven police were justified.” 

Having appeared at randomly chosen dormitory rooms the experi- 
menter requested the potential subject to write such an essay and as 
an inducement offered a financial reward of either $.50, $1.00, $5.00 
or $10.00. After the essay had been completed the experimenter asked 
the subject to fill out an attitude measure indicating how much he ap- 
proved or disapproved of “the actions of the New Haven police.” As 
this measure was handed to him the subject was invited to take into 
account, if he so chose, the pro-police arguments he had just improvised 
in writing the counterattitudinal essay. 

The prediction derived from dissonance theory was that the lesser mag- 
nitude of reward would generate a greater magnitude of dissonance and 
thus greater attitude change; i.e., an inverse monotonic relationship was 
expected between the amount of money offered to elicit the 
counterattitudinal advocacy and the degree of attitude change toward the 
pro-police position. This prediction was apparently confirmed; the $.50 
reward group showed greatest attitude change in the pro-police 
direction, the $1.00 group next greatest change, and the $5.00 and $10.00
reward groups did not differ from a control group which, without any 
prior counterattitudinal advocacy, had merely filled out an attitude scale 
concerning the question of whether “the actions of the New Haven 
police” were justified. 

On the basis of attitude theory considerations that need not be re- 
viewed here I thought that the opposite prediction made more sense; 
that the degree of attitude change would be a positive, rather than 
II DEMONSTRATION THROUGH ALTERED REPLICATION 

There are at least two ways in which our basic point can be 
demonstrated. The one that will occupy us now is, in essence, a classic strategy. 
It is the one that is commonly employed when one suspects that the 
findings obtained in some reported “successful” experiment are in reality 
not due to the validity of the experimenter’s hypothesis but to some 
unintended influence let loose by his poorly designed operations of 
manipulation or measurement. In this strategy one redesigns the suspected 
operations and repeats the experiment. If, despite the operational 
changes, the original findings are replicated, one now has presumptive 
evidence that one’s objections and doubts were ill taken; if meaningfully 
different, nonreplicative data are obtained, one has some claim (though 
it should not be overindulged) to emit the prideful chortle “I told 
you so” or “Thus do I refute Professor Berkeley.”

How does this bear upon our intention to confirm, by empirical 
demonstration, that unsuspected arousal of evaluation apprehension does 
sometimes generate false confirmations of hypotheses? Obviously, when we 
suspect that this has happened and where we have a speculative 
interpretation of how it happened, we may undertake an altered replication 
of the original study. The object would be to change those operations 
which we believe to have aroused evaluation apprehension and to have 
fostered the expectation that a certain way of responding would bring 
positive evaluation from the experimenter. If such an altered replication 
were to yield data that, as predicted, were quite different from the 
findings of the original study, this could be taken as evidence that our 
original concern over evaluation apprehension was neither excessive nor 
misplaced. In effect, such an outcome would be a demonstration, through 
[...] apprehension can generate [...] 
demonstration [...] incontrovertibly definitive [...] 
[...] one employed 
in the demonstration phase of our inquiry into the evaluation 
apprehension phenomenon. The substantive area of concern was research in 
support of a basic hypothesis derived from cognitive dissonance theory. 
[...] (1959) and Cohen (in Brehm and Cohen, 1962) [...] 
confirmed the hypothesized relationship [...] 
(i.e., arguing in support of an attitude position opposite to one’s own 
true conviction) is undertaken with little justification (e.g., for a small
“in this study, as in others of similar design, the low-dissonance 
(high-reward) subjects would be more likely to suspect that the 
experimenter had some unrevealed purpose. The gross discrepancy between 
spending a few minutes writing an essay and the large sum offered, 
the fact that this large sum had not yet been delivered by the time 
the subject was handed the attitude questionnaire, the fact that he was 
virtually invited to show that he had become more positive toward the 
New Haven police: all these could have served to engender suspicion 
and thus to arouse evaluation apprehension and negative affect toward 
the experimenter. Either or both of these motivating states could 
probably be most efficiently reduced by the subject refusing to show anything 
but fairly strong disapproval of the New Haven police; for the subject 
who had come to believe that his autonomy in the face of a monetary 
lure was being assessed, remaining ‘anti-police’ would demonstrate that 
he had autonomy; for the subject who perceived an indirect and 
disingenuous attempt to change his attitude and felt some reactive anger, 
holding fast to his original attitude could appear to be a relevant way 
of frustrating the experimenter. Furthermore, with each step of increase 
in reward we could expect an increase in the proportion of subjects 
who had been brought to a motivating level of evaluation apprehension 
or affect arousal.”

But such a reinterpretation is merely another instance of applied “wise 
guyism” unless one attempts to put it to a close and demanding further 
experimental test. To properly employ the altered replication strategy 
that I have already described, it was necessary to remove the posited 
evaluation apprehension dynamic, or at least to subdue it, and otherwise 
to hew as closely as possible to the design and operations of the original 
study.

How might the first of these desiderata best be implemented? The 
reinterpretation in terms of evaluation apprehension had an obvious 
methodological implication. If the posited data-biasing dynamic had 
actually occurred, this had been made possible by the fact that the 
experimenter conducted both the dissonance arousal and subsequent attitude 
measurement. For evaluation apprehension and negative affect, if they 
had been aroused in the high-reward subjects, would have been focused 
upon the experimenter; and it would have been either to avoid his 
negative evaluation or to frustrate him, or both, that the high-reward 
subject would hold back (from the experimenter and possibly even from 
himself) any evidence that he had been influenced by the pro-police 
arguments that he had elaborated in the essay he had just completed. 

Thus, quoting again from the original article (Rosenberg, 1965), these 
an inverse, function of the amount of monetary payment that was offered 
to elicit the counterattitudinal advocacy. Also it seemed likely to me 
that Cohen’s results could be due to an unsuspected arousal of evaluation 
apprehension and a strong, but implicit, cueing which would have led 
most low-dissonance (i.e., high-reward) subjects to withhold evidence 
that they had influenced themselves in the pro-police direction.

Exactly what leads us toward this sort of interpretation of what really 
happened in this and similar early dissonance experiments on attitude 
change? The answer can best be conveyed by some extended quotations 
from the original article (Rosenberg, 1965) which posed the evaluation 
apprehension reinterpretation of the Cohen study and then went on 
to report the altered replication by which that reinterpretation was 
tested.


“It seems quite conceivable that in certain dissonance experiments 
the use of surprisingly large monetary rewards for eliciting 
counterattitudinal arguments may seem quite strange to the subject, may suggest 
that he is being treated disingenuously. This in turn is likely to confirm 
initial expectations that evaluation is somehow being undertaken. As 
a result the typical subject, once exposed to this manipulation, may 
be aroused to a comparatively high level of evaluation apprehension; 
and, guided by the figural fact that an excessive reward has been offered, 
he may be led to hypothesize that the experimental situation is one 
in which his autonomy, his honesty, his resoluteness in resisting a special 
kind of bribe, are being tested. Thus, given the patterning of their initial 
expectations and the routinized cultural meanings of some of the main 
features of the experimental situation, most low-dissonance subjects may 
come to reason somewhat as follows: ‘They probably want to see whether 
getting paid so much will affect my own attitude, whether it will 
influence me, whether I am the kind of person whose views can be changed 
by buying him off.’ 

“The subject who has formulated such a subjective hypothesis about 
the real purpose of the experimental situation will be prone to resist 
giving evidence of attitude change, for to do so would, as he perceives 
it, convey something unattractive about himself, would lead to his being 
negatively evaluated by the experimenter. On the other hand, a similar 
hypothesis would be less likely to occur to the subject who is offered 
a smaller monetary reward, and thus he would be less likely to resist 
giving evidence of attitude change.”

On the basis of these speculative considerations I suggested, regarding 
Cohen’s experiment, that 
considerations led us toward the basic alteration employed in our 
replication: 

“The most effective way, then, to eliminate the influence of the biasing 
factors would be to separate the dissonance arousal phase of the 
experiment from the attitude measurement phase. The experiment should be 
organized so that it appears to the subject to be two separate, unrelated 
studies, conducted by investigators who have little or no relationship 
with each other and who are pursuing different research interests. In 
such a situation the evaluation apprehension and negative affect that 
are focused upon the dissonance-arousing experimenter would probably 
be lessened and, more important, they would not govern the subject’s 
responses to the attitude-measuring experimenter and to the information 
that he seeks from the subject.”

We need not tarry here over the details of the staging of the
two-experiment disguise. It will suffice to say that the disguise (judged
by what the subjects said in quite probing postexperimental interviews)
worked well, and that adaptations of it have since been used successfully
both by others (e.g., Carlsmith, Collins, and Helmreich, 1966) and in my
own continuing research program on attitude change (Rosenberg, 1968).
Nor do we have to linger over precise descriptions of the instructions
and measurement procedures used with the subjects. Except for changes
required by our use of the two-experiment disguise, all but two aspects
of the procedure were identical with those used by Cohen in the original
experiment. The two deviations from the original experiment were
necessitated by the fact that it was conducted at Yale University and the
altered replication at Ohio State University. Thus in the second study
Yale undergraduates did not serve as subjects and the issue for
counterattitudinal advocacy could not be the same one employed at Yale.
The issue that was used concerned the subjects’ attitudes toward a
proposed ban upon any further participation by the O.S.U. football team
in the Rose Bowl contest. Such a ban had been enacted, and later
rescinded, by the faculty senate during the previous year and extreme
student opposition had been expressed through demonstrations and some
riot-like group activity.

The experimental subjects wrote essays favoring the restoration of
the Rose Bowl ban. The three experimental groups wrote the essays
for promised rewards (delivered after completion of the essay) of $.50,
$1.00 and $5.00 respectively. A control group merely took the dependent
variable measure: a questionnaire on seven different campus issues, one
of which was the Rose Bowl ban, while another dealt with the
desirability of O.S.U. abandoning its policy of giving athletic
scholarships.

On the Rose Bowl issue the Kruskal-Wallis one-way analysis of variance
disclosed a significant relationship (p < .001) and inspection shows
this to be of the positive, monotonic type: the larger the financial
reward for counterattitudinal advocacy the greater the degree of attitude
change (as estimated by comparison to the baseline attitude data provided
by the control group). The $.50 and $1.00 groups showed greater
favorability toward the Rose Bowl ban than did the control group
(p < .01) and less favorability than the $5.00 group (p < .02).*

A similar overall finding (p < .005) was obtained on the athletic
scholarship issue, though the differences between the groups were of
lesser (but still significant) magnitude. This finding was also predicted,
and is interpreted as evidence of some generalization of the main attitude
change effect to a related, antiathletic issue.

Avoiding the lure of another theoretical area I have so far said nothing
about the substantive issues in this experiment. And I shall resist the
temptation to do so now, except to note that the positive relationship
obtained between degree of reward for counterattitudinal advocacy
and degree of resultant attitude change confirms the prediction drawn
from my own affective-cognitive consistency theory and disconfirms the
prediction derived from dissonance theory. But these issues of attitude
theory need not be examined here. They are taken up in my earlier
publications (Rosenberg, 1956, 1960a, 1960b) and in a published debate
between myself and Aronson, the latter writing as an advocate of a
sophisticated, modified version of dissonance theory (Aronson, 1966;
Rosenberg, 1966).

Before I turn away completely from the whirlpool [...] around which I
have been skirting, I should like to make clear that the controversy
concerning counterattitudinal advocacy effects is not, by any means,
fully resolved on the basis of this one study. New issues have since
been discovered in this by now [...] of theoretical debate, experiment
and counter-experiment. The two-experiment disguise is now fairly
standard in this particular research area. Also, the fact that under some
conditions, at least, the incentive rather than the “dissonance”
relationship does obtain is now credited by the main participants in the
persisting debate, though they continue to disagree (see the
contributions of Janis, Carlsmith, Collins, Aronson, and Rosenberg in
Abelson, et al., 1968) about the nature and provenance of those
conditions.

* The probabilities reported here as confirming the [...] in this study
are all based upon the one-tailed [...]. The same convention has been
employed whenever [...] predicted, though, as will be seen, most of the
findings [...] statistical significance even if the more stringent
two-tailed standard were applied. Within the tables summarizing [...]
findings, a designation of N.S. (i.e., not significant) represents a
probability value larger than .10, usually considerably larger.

Of greater pertinence at the moment are two points that have nothing
to do with counterattitudinal advocacy as such, though they are grounded
upon the Rose Bowl counterattitudinal advocacy study. The results of
the altered replication can be taken as at least an indirect demonstration
of the possibility that evaluation apprehension is capable of inducing
systematic bias in experimental responding, and thus of generating
undetected Type I or Type II errors (in the sense of invalid confirmations
and disconfirmations of hypotheses). The second point is that such bias
effects need not remain undetected, nor need they be left in the realm
of the merely suspected. Variations of the altered replication strategy
could probably be designed in most instances where an evaluation
apprehension artifact is suspected to have induced systematic bias in the
array of dependent variable data.

Inventiveness and care in the design of altered replications, and a
readiness to resort to them frequently, could probably do much to
improve the reliability of the data that experimental psychologists
collect to test hypotheses and in reaction to which they often develop
new hypotheses.


Evaluation apprehension is by no means the only conceivable source
of systematic biasing of data, nor is it an equally threatening possibility
in all realms of psychological research. But whenever our experiments
are heavy on surprise and whenever the experimenter’s purposes are
likely to seem mysterious to subjects (or whenever subjects are likely
to sense disingenuousness in the experimenter’s explanatory
communication) we would do well to adopt the cautionary stance of
obsessive concern over the evaluation apprehension problem. And having
adopted this stance, we would do well to go beyond mere obsession or mere
disputatiousness and get back to the laboratory where we can put our
suspicions to test by conducting the relevantly redesigned altered
replications.


Anyone who resorts to this strategy, however, had better be prepared
to find himself at the receiving end of the ironic justice process. For
the criticized and their partisans can reverse the tactic on the aspiring
critic. An altered replication designed to remove a suspected evaluation
apprehension contaminant from some previously reported experiment
can, in itself, be interpreted as having been contaminated by evaluation
apprehension or by some other biasing force (e.g., experimenter
expectancy, demand characteristics, subject presensitization).

The mind reels, and one’s strength does quaver a bit, at the conceivable
prospect of an infinite regress in which a study is designed to
take systematic bias out of a study that was designed to take systematic 
bias out of a study that was . . . but in all likelihood, even fully con- 
sensual devotion to the expunging of evaluation apprehension effects 
will stop far short of such total, unlimited doubt. At some point it should 
become clear that particular experimental paradigms and particular
substantive areas of experimental inquiry have been pretty thoroughly
“debugged.” And meanwhile, whatever temporary disruption, confusion,
and outraged pride may result, the ultimate outcome can be trusted 
to be beneficial — not only in that it will probably elevate the trustworthi- 
ness of data in the contested area, but also because there is nothing 
more restorative of the scientific temper than an occasional encounter 
with the hard, intractable fact that one has made, and remains capable 
of making, mistakes. 


III. DEMONSTRATION THROUGH MANIPULATED 
AROUSAL AND CUEING 

The construct validation strategy, useful as it has been in theory test- 
ing generally and in our own research program, is really a version of 
the Platonic analogy of the cave. The shadows that are projected across 
the wall (i.e. our data) denote that something is passing between us 
and the sun — but we are still tantalizingly out of direct contact with 
its substance. 

Thus, the foregoing study, and others of similar design, though they
seem to confirm the reality of the evaluation apprehension dynamic, 
do not bring us into direct contact with it. To look more closely at 
that process it appeared necessary to arouse it, rather than reduce it 
as was done in the Rose Bowl study. 

My first effort in this direction was undertaken in an experiment in
which I had valuable collaboration from Dr. Raymond Mulry who was,
at that time, one of my graduate students. Our working plan was simple,
perhaps even crude.

Through a printed “Background Information Sheet” we conveyed to
two separate experimental groups the following points: They were about
to participate in a study of social perception, in which they were to
judge how much they liked or disliked various pictured persons. Past
research by others, they were informed, had shown that liking-disliking
reactions to strangers were correlated with personality, particularly with
whether the rater was psychologically “mature” or “immature.” To one
experimental group it was disclosed further that the main burden of
the past research (various invented journal articles were cited) was
that psychologically mature and healthy people show greater liking for
strangers than do immature people. To a second experimental group
the printed communication conveyed the opposite: past research had
shown that it was psychologically immature and comparatively
unhealthy persons who showed greater liking for strangers.

Beyond this crucial point of difference the two forms of the
manipulative communication again converged. All of the past research, it
was asserted, had been done with subjects in face-to-face contact with
real strangers. Would the same relationship between psychological maturity
and liking hold for mere photographs of unknown people? This, the
subjects were told, was a question that we planned to pursue in further
research. But first it was necessary to “standardize” a set of
photographs, to determine how much, on the average, they elicited liking
or disliking reactions. Thus in the present study, according to the
concluding paragraph of the Background Information Sheet, we were not
testing the personalities of our subjects; rather, we were simply
establishing normative data against which we would later compare the
liking-disliking ratings elicited from subjects whose personality
qualities had already been assessed.

The simplicity and directness of this manipulation make clear its
intended purpose. We were attempting to arouse evaluation apprehension
by confirming, for our subjects, the sort of expectancy that subjects
often bring to experiments, namely, in the present instance, that as
researchers we were ordinarily interested, among other things, in
personality assessment. And furthermore we were cueing our subjects about
what past research had shown (and thus about the likely content of
our own expectations) concerning the ways in which “mature” and
“immature” people tended to react to strangers. Why did we add that we
had no idea whether the same relationship would hold with pictures
as with reactions to directly encountered real persons? Partly to enhance
the general credibility of our communication and partly to reduce what
might otherwise be a too overwhelming influence upon the individual’s
judgments of the pictures. Also this made the present study somewhat
more comparable to many others in which the common strategy (whatever
has gone before) is to provide overt reassurance that the subjects’
personalities are not being scrutinized.

Apart from the two experimental groups (both aroused to evaluation
apprehension and cued either toward liking or disliking responses
respectively) we also set up a control group. These subjects received a
brief neutral communication which did nothing to arouse evaluation
apprehension or to provide directional cueing. The data from this group
served, then, as the baseline against which we could assess the
significance of the deflections, toward the liking and disliking ends of
the scale, of the experimental subjects’ self-reported judgments.

I have already suggested that the Rose Bowl, altered replication study 
could be interpreted as providing indirect evidence that evaluation ap- 
prehension can contaminate experimental data but not that it always 
or usually will. The same stricture is all the more applicable in limiting 
the meaning of the present study. Only a failure to find significant differ- 
ences between the mean liking ratings of the three groups could be 
taken as definitive; for this would mean that, even under optimum condi- 
tions, evaluation apprehension does not get aroused or, if aroused, does 
not affect experimental data. 

But if significant differences were obtained just what would they tell 
us? Merely that the data biasing dynamic that we suspect to be uninten- 
tionally induced in certain kinds of experiments can be intentionally 
induced by rather direct manipulation. In essence, then, we were giving 
ourselves a chance to increase the pertinence of the null hypothesis 
or provisionally to reject it. 

If the data seemed to allow the latter (i.e. if they showed that, at 
least by intentional amplification, evaluation apprehension can be made 
to affect experimental responding) we would also be in a position to 
carry out inquiry a few steps further. We would then be able to ask 
what kinds of people, situational definitions, and experimental tasks tend 
to facilitate or diminish the operation of the process in which data are 
systematically biased under the influence of evaluation apprehension. 

These foregoing considerations set the context in which we can now
proceed to discuss the findings of the first evaluation apprehension
manipulative study; and they are equally relevant to the various other
studies that followed it and employed the same basic design paradigm.

Obviously I would have no claim to write this chapter if the results
of this first study, and of the others that followed upon it, had failed to
render the null hypothesis improbable. Thus there will be little surprise
in the disclosure that in the first of these studies a large and significant
difference was obtained between the two experimental groups.

For the 12 pictures of male faces (each rated on a 21 point like-dislike
scale ranging from +10 to -10) the algebraic sums of each subject’s
judgments were computed. The means of these scores for the groups of
male and female subjects, and the probabilities of the differences
between various pairs of means, are displayed in Table I.
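
The scoring just described can be illustrated with a short Python
sketch. It is offered only as an illustration under stated assumptions:
the function names are hypothetical, and the ratings shown are invented
for the example rather than taken from the study.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = -10, 10   # the 21 point like-dislike scale
N_PICTURES = 12                  # pictures rated by each subject


def algebraic_sum(ratings):
    """Signed (algebraic) sum of one subject's twelve picture ratings."""
    if len(ratings) != N_PICTURES:
        raise ValueError("expected one rating per picture")
    if any(r < SCALE_MIN or r > SCALE_MAX for r in ratings):
        raise ValueError("rating outside the -10..+10 scale")
    return sum(ratings)


def group_mean(subjects):
    """Mean of the algebraic sums across the subjects of one group."""
    return mean(algebraic_sum(r) for r in subjects)


# Invented ratings for two hypothetical subjects (not data from the study).
example_group = [
    [3, -2, 5, 0, 1, -4, 2, 6, -1, 0, 4, -3],
    [-5, -1, 0, 2, -3, -6, 1, 0, -2, -4, 3, -1],
]

print(group_mean(example_group))
```

Because each of the twelve ratings lies between -10 and +10, a subject's
algebraic sum can range from -120 to +120, which gives a sense of scale
for the group means reported in Table I.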

TABLE I

Like-Dislike Mean Sums for Groups and Probabilities of
Differences Between Groups

                        Liking                   Disliking
                        treatment    Control     treatment

Males judging           +23.20       -11.25      -8.65

    Liking vs. Control              p < .0001
    Control vs. Disliking           N.S.
    Liking vs. Disliking            p < .0001

Females judging         +4.15 [column and remaining entries illegible]

For male subjects in the experimental group that was cued to think
that mature people like strangers the mean algebraic sum of the ratings
was +23.20. For the male subjects cued to think that immature people
show greater liking for strangers the mean algebraic sum was -8.65.
The significance of the difference between these groups (computed by
Mann-Whitney Rank Sum statistic, as are most of the other simple
differences between groups that are reported in this chapter) was clearly
established (p < .0001).
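
For readers unfamiliar with the rank sum procedure named above, the
following Python sketch shows how the Mann-Whitney U statistic is
obtained from two independent groups of scores. It is a minimal
illustration, not a reconstruction of the computations actually
performed: the function name is hypothetical, ties are handled by
midranks (an assumption), and no probability value is derived.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples, using midranks for ties."""
    pooled = sorted(list(x) + list(y))

    # Assign each distinct value the average of the ranks it occupies.
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j

    n_x, n_y = len(x), len(y)
    rank_sum_x = sum(ranks[v] for v in x)    # the "rank sum" for group x
    u_x = rank_sum_x - n_x * (n_x + 1) / 2
    u_y = n_x * n_y - u_x
    return u_x, u_y


# Invented scores for two hypothetical groups (not data from the study).
print(mann_whitney_u([5, 7, 9], [1, 2, 3]))   # complete separation of the groups
```

The smaller of the two U values is the one ordinarily referred to
published tables; a value of zero indicates that every score in one
group exceeds every score in the other.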

However, as reference to Table I makes clear, we also encountered
an interesting complication. The disliking treatment did not, in fact,
exert a significant influence upon the male subjects who received it.
This is apparent from the fact that the picture ratings from the
unmanipulated control group are, on the average, just as negative (the
mean is -11.25) as those from the disliking group and, of course, there
is no significant difference between these groups.

Does this signify that the disliking cueing that we employed was
simply not credible? Or that, though credible, subjects could not bring
themselves to behave in opposition to the normative standard (at least
with typical middle class Americans) that whatever our private
disposition may be, strangers are to be approached with external
affability?

Either of these interpretations would be plausible if it were not for
various other available findings. The most striking is that with the
separate groups of female subjects (judging pictures of males, it should
be remembered) the disliking treatment does influence ratings and they
are as deviant from the control mean in the negative direction (p < .01)
as the mean for the liking treatment is in the positive direction.

Furthermore, even with the male subjects, there is evidence that a
personality-linked variable has mediated the effect of the disliking
treatment. For all subjects we had available their scores on the
Marlowe-Crowne (1961) Social Desirability (SD) Scale which had been
administered sometime before the present study. When the male subjects
are split into high and low halves on the Social Desirability Scale
distribution we find that a trend in the counterhypothesis direction is
visible between the High SD [...] disliking groups respectively. With the
Low SD males that trend is reversed and approaches significance
(p < [...]). If the latter trend had been stronger than that obtained
from the former the overall findings would have supported the predicted
relationship at an acceptable level of statistical significance. Thus it
is the Low SD males who, needing less approval from others (and, we may
assume, from the experimenter) are able to react against the normative
grain and win a judgment of maturity by representing themselves as
disliking certain strangers.

There is a glimmer of a [...]; one would expect people with a strong
need for approval (the High SD scorers) also to show a more persisting
propensity for representing themselves as positively disposed toward
random strangers. Passing beyond the data from the disliking groups, the
social desirability factor did exert the expected influence within the
experimental group that was cued to believe that liking for strangers is
a sign of maturity. The High SD male subjects in this condition give more
extreme liking scores (the mean sum of their ratings is +34.77) than do
the comparable Low SD subjects (whose mean sum is +13.72). The
difference between these groups is significant (p [...]).

Despite the few tantalizing ambiguities discussed above, the overall
import of this first manipulative study is quite clear: with intentional
arousal of evaluation apprehension, subsequent directional cueing does
“take”; that is, it affects our subjects’ experimental responding.
Postexperimental interviewing established to our relative satisfaction
that these results were not due to the subjects’ comprehension of our
unrevealed purposes. The subjects reported that the preliminary
material that they read concerning “earlier studies” of reactions to
strangers had not particularly influenced them. I do not take these
reports as veridical, but neither do I think that they are due to any
intention to deceive the experimenter. From interviewing conducted after
data collection in this study and others I have formed the impression
that subjects will usually obscure from themselves the extent to which
they regulate their responding so as to win favorable judgments from
the experimenter. And though I cannot anchor the following judgment
on a base of hard data I would hazard the psychologically obvious
interpretation that this sort of motivated inattention is due to the
typical subject’s need to conserve a positive image of himself even as he
half-knowingly seeks to make a positive impression upon the experimenter.

Upon completion of this first demonstration study it would have been
possible to plunge directly into studies concerned with variables that
facilitate or suppress the evaluation apprehension data biasing process.
But the rating of pictures for their likeability is a rather special sort
of task and, as we have seen, certain complications did arise on the
dislike cueing side of the experiment. To satisfy ourselves that the
process under study was a fairly general one, it seemed necessary to adapt
the basic experimental paradigm to some other and quite different sorts
of experimental tasks.

Two further studies of this type were successfully carried out with
male undergraduate subjects at Dartmouth College. I shall describe them
somewhat more briefly than the preceding study, since they are useful
here only in adding some empirical weight to an assertion that I have
already registered more than once, i.e., that evaluation apprehension
combined with some hints about how ‘normal people’ react (and thus
implying something about how the experimenter’s approval can be
obtained) does exert systematic biasing influence upon experimental
responding.


In one of these additional studies I was joined by two coinvestigators,
Philip Corsi (who developed the basic experimental design and
operations) and Edward Holmes, both of whom were advanced undergraduate
students in the Psychology Department of Dartmouth College. We
used an extremely simple task: the subject taps upon a key with his
right and left index fingers for six separate ten second intervals, half
of these with one finger and half with the other. The number of taps
is automatically registered on a Veeder-Root meter. Normally there
is a considerable discrepancy between the performance of the two
fingers, the index finger of the dominant hand producing more taps
than that of the nondominant hand.


As the subject entered the experimental room he was asked by the
experimenter “Did you take the general abilities test and the personality
inventory during freshman week?” The purpose of this query was to
stir some initial prompting toward evaluation apprehension.

Following the administration of three brief abilities tests focused on
verbal and symbolic skills (and intended to rouse the subjects’ interest
in their performances) the experimenter proceeded to give a memorized,
verbal explanation of the finger tapping task. For the control subjects
(eleven right handed undergraduates) this consisted only of a simple
description of the task. Working with only this information these subjects
did, indeed, produce more taps with their right than with their left
index fingers. The mean difference between the sum of right index taps
minus the sum of left index taps was 22.45 for this group.

With an experimental group of the same number of right handed
subjects the preliminary communication contained some additional
information designed to heighten evaluation apprehension and to turn it
in a particular response direction. Thus they were told that recent
research with graduate students at Yale and at the University of [...]
had turned up the surprising finding that the number of taps with the
nondominant index finger was virtually equal to the number with the
dominant index finger. The clear implication was that people with high
intelligence (or perhaps of higher educational attainment) performed
differently than did other, more ordinary, persons.
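
The dependent measure in this task, the sum of right index taps minus
the sum of left index taps, is simple enough to state as a brief Python
sketch. The function name is hypothetical and the tap counts are
invented for illustration; they are not data from the study.

```python
def tap_difference(right_counts, left_counts):
    """Sum of right-index taps minus sum of left-index taps for one subject."""
    return sum(right_counts) - sum(left_counts)


# Invented counts for one hypothetical right handed subject: three ten second
# intervals with each finger, six intervals in all.
right = [52, 50, 51]
left = [45, 44, 46]

print(tap_difference(right, left))   # 153 - 135 = 18
```

A score near zero means the two fingers performed about equally, which
is the pattern the experimental communication held out as typical of
the graduate students it described.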

The result was striking. The mean difference between the sums
of right and left index finger taps was only 10.73 and this is quite
significantly (p < .005) different from the comparable score of 22.45
obtained with the control group. A clear hint about the relation between
performance on the experimental task and the likely evaluation that
the experimenter would draw from the subject’s performance had
produced a “transcendence effect.” The experimental subjects performed
far more efficiently with the left index finger than subjects (both
our own control subjects and those in many other studies) ordinarily do.

One further finding from this study is of particular interest. The
control and experimental data just described were obtained under a “full
feedback” condition; that is, the meter registering the number
of taps was visible to each subject as he went through the separate
right and left finger tapping trials. But on six other trials with the
same subjects we used a “partial feedback” condition in which the meter
was exposed only after the ten second tapping interval was completed.
(For half of the subjects the partial feedback trials, and for
the other half the full feedback trials, were run first.) In the partial
feedback condition the same sort of difference is obtained between the
experimental and control groups but it is of lesser significance
(p < [...]) and the difference between the means for the two groups is
correspondingly smaller. It seems clear, then, that the continuous
availability of information about the quality of his performance will
assist the subject in guiding that performance in the direction that he
expects will bring him a favorable evaluation.

A more ambitious study was undertaken at the same time. This one
followed our original experimental paradigm (as used in the
picture-rating study) in all its basic details, except for using a quite
different sort of experimental task.

In two basic experimental conditions and in a separate control
condition the subjects (20 in each group) worked with a booklet of addition
problems, each page of which had the same number of total digits.
The booklet was far too lengthy for completion and the subject was
free to stop whenever he chose to do so. Each fourth page in the booklet
presented some scales from the Nowlis (1965) Mood Adjective Check
List which the subject was to fill out before he went on to the next
page of addition problems.

Again the experiment began with the subject reading a Background
Information Sheet. This document, modeled after the one used in the
original picture rating study, explained that past research by other
investigators had revealed a relationship between personality competence
and the amount of efficiency shown, and pleasure experienced, in
executing dull, routine tasks. One form of the information sheet went on to
report that the clear finding from many studies (again fictional citations
were given) was that the mature, psychologically healthy person
experienced more pleasure and was more efficient at such tasks than was
the immature person.

There followed a paragraph explaining the probable psychological
basis of this finding (i.e., emotionally healthy people are less prone to
distraction and enjoy challenges more than emotionally handicapped
persons). The text then went on to explain that all of the previous
studies had dealt with ‘motoric and manual’ tasks and there was no
proof that the same relationship would hold true with clerical or other
paper and pencil tasks. This, it was explained, was the question that
would be addressed in later research that the experimenters planned,
but first it was necessary to develop a standard task and to determine
what the typical levels of performance would be. Thus the subject was
about to participate in a pilot study focused not upon his personality
but rather upon determining average performance levels for various
kinds of clerical tasks, and at present the task whose utility was to
be explored was arithmetic addition.

The content of the Background Information Sheet used with the other
experimental group was, of course, exactly parallel in form but opposite
in content. It reported that past data with routine motoric performances
had shown that psychologically healthy and mature persons were less
efficient at carrying out such tasks and got less pleasure from them
than did psychologically immature persons. Again a brief psychological
explanation of the basis for this finding was offered and this was followed
by exactly the same further comments that were used with the first
version of the manipulative communication.

Obviously this manipulation, just as the one used in the picture rating
study, was likely to exert a strong force toward arousal of evaluation
apprehension and at the same time would provide unambiguous cues
that could be used to regulate experimental responding so as to maximize
the chance that one would be judged ‘normal’ by the psychological
experimenter. However, despite the directness of the manipulation,
postexperimental questionnaires and postexperimental interviewing with a
sample of the subjects revealed little acknowledged penetration of the
purpose of the experiment. Subjects did show accurate recall of the
content of the Background Information Sheet but usually they insisted
that they did not feel that their personalities were being scrutinized;
instead, they reported that they had simply worked on the problems
until they got bored or fatigued.

But, though evaluation apprehension or concern for performing in
the “normal” way typically was not acknowledged to the interviewer
(and, possibly, not fully acknowledged to the self) it did clearly influence
the actual performances of the subjects on the arithmetic addition task.
That this is the case is clear from the findings presented in Table
II. The means reported there are from the two experimental groups.
The table also displays means from a control group whose members
worked on the addition problems without any previous arousal of
evaluation apprehension, thus establishing the baseline performance
against which the experimental groups can be judged.

It is clear that the two experimental groups differ from one another
in the predicted direction. On the average, the subjects who were led
to believe that mature people tend to be comparatively [...]
inept toward routine tasks completed ten less addition problems than
did the opposite experimental group (p < .03). Similarly, they
solved eleven fewer problems than did the other experimental group.

However, examination of Table II will quickly suggest that the
treatment emphasizing that mature people do not perform well on [...]
was a far stronger influence upon the performance than was the other
treatment. While the former experimental group differs significantly from
the control group on both the number of problems completed and
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mely smaUer It seems clear, then, that the continuous availability of 
mfoLation about the quality of his performance vvill assis *0 
m guiding that performance m the direction that he expects will bring 

him a favorable evaluation . rm, - 

A more ambitious study was undertaken at the same time This on 
followed our onginal experimental paradigm (as used in the pictoe-rat- 
ing study) m all its basic details, except for using a quite diBerent 
sort of experimental task 

In two basic expenmental conditions and m a separate control condi- 
tion the subjects (20 m each group) worked with a booklet of addition 
problems, each page of which had the same number of total digits 
The booklet was far too lengthy for completion and the subject was 
free to stop whenever he chose to do so Each fourth page in the booklet 
presented some scales from the Nowhs (1965) Mood Adjective Check 
List which the subject was to fill out before he went on to the next 
page of addibon problems 

Again the experiment began with the subject reading a Background
Information Sheet. This document, modeled after the one used in the
original picture rating study, explained that past research by other
investigators had revealed a relationship between personality competence
and the amount of efficiency shown, and pleasure experienced, in
executing dull, routine tasks. One form of the information sheet went on
to report that the clear finding from many studies (again fictional
citations were given) was that the mature, psychologically healthy
person experienced more pleasure and was more efficient at such tasks
than was the immature person.

There followed a paragraph explaining the probable psychological
basis of this finding (i.e., emotionally healthy people are less prone
to distraction and enjoy challenges more than emotionally handicapped
persons). The text then went on to explain that all of the previous
studies had dealt with “motoric and manual” tasks and there was no
proof that the same relationship would hold true with clerical or other
paper and pencil tasks. This, it was explained, was the question that
would be addressed in later research that the experimenters planned,
but first it was necessary to develop a standard task and to determine
what the typical levels of performance would be. Thus the subject was
about to participate in a pilot study focused not upon his personality
but rather upon determining average performance levels for various
kinds of clerical tasks, and at present the task whose utility was to
be explored was arithmetic addition.

The content of the Background Information Sheet used with the other
experimental group was, of course, exactly parallel in form but opposite
in content. It reported that past data with routine motoric performances
had shown that psychologically healthy and mature persons were less
efficient at carrying out such tasks and got less pleasure from them
than did psychologically immature persons. Again a psychological
explanation of the basis for this finding was offered and this was
followed by exactly the same further comments that were used with the
other version of the manipulative communication.

Obviously this manipulation, just as the one used in the picture rating
study, was likely to exert a strong force toward arousal of evaluation
apprehension and at the same time would provide unambiguous cues
that could be used to regulate experimental responding so as to maximize
the chance that one would be judged “normal” by the psychological
experimenter. However, despite the directness of the manipulation,
postexperimental questionnaires and postexperimental interviewing with a
sample of the subjects revealed little acknowledged perception of the
purpose of the experiment. The subjects could recall the content of the
Background Information Sheet, but usually they insisted that they did
not feel that their personalities were being scrutinized; instead, they
reported that they had simply worked until they got bored or fatigued.

But, though evaluation apprehension or concern for performing in the
“normal” way typically was not acknowledged to the interviewer (and,
possibly, not fully acknowledged to the self), it did clearly influence
the actual performances of the subjects. That this is the case is clear
from the data presented in Table II. The means reported there are from
the two experimental groups. The table also displays means from a
control group whose members worked on the addition problems without
previous arousal of evaluation apprehension, thus establishing
performance levels against which the experimental groups can be judged.

It is clear that the two experimental groups differ from one another in
the predicted direction. On the average, subjects who were led to
believe that mature people tend to be dysphoric and inept toward routine
tasks completed [...] fewer problems than did the opposite experimental
group (p < [...]). Similarly, they correctly solved eleven fewer problems
than did the other experimental group (p < .003).

TABLE II

PERFORMANCE MEANS FOR GROUPS, AND PROBABILITIES OF
DIFFERENCES BETWEEN GROUPS

                              Pleasure and                Displeasure and
                              efficiency                  inefficiency
                              treatment      Control      treatment

Mean number of problems
  completed                   57.96          [illegible]  [illegible]
    (pleasure vs. control: N.S.)
Mean number of problems
  correct                     43.39          41.00        32.25
    (pleasure vs. control: N.S.; control vs. displeasure: p < .005)
Mean efficiency index
  (No. correct/No. complete)  [illegible]    [illegible]  [illegible]

However, examination of Table II will quickly suggest that the treatment
emphasizing that mature people do not perform well on dull tasks
was a far stronger influence upon the performance than was the opposite
treatment. While the former experimental group differs significantly
from the control group on both the number of problems completed and the
number solved, the latter group does not. An obvious proposition
virtually suggests itself and it is one that may well deserve an
important place in the general theory of evaluation apprehension
processes that is emerging as we pursue our experimental program. Simply
stated it is this: cues suggesting a response pattern that is likely to
bring approval from the experimenter will have stronger influence upon
actual responding when that pattern is also less effortful in execution.
There is, of course, an alternative interpretation that is quite plausible
in the present instance: the “displeasure and inefficiency” version of
the Background Information Sheet may simply have been more credible,
more in accord with the initial expectations of the subjects. After all,
it does seem likely, at a common sense level, that only “odd” people
will enjoy routine, repetitious tasks. But this interpretation is weakened
by the fact that, by a somewhat subtler analysis, we find that the subjects
in the opposite experimental group were also influenced in their
performances on the addition task. Apart from comparing the groups on the
total number of problems completed and correctly completed, we went
on to compute for each subject an index based upon the ratio between
these two separate scores. Dividing the number of problems correctly
solved by the number completed we obtain a meaningful estimate of
the quality of the subject’s performance in relation to the scope of that
performance.
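
The index just described can be illustrated with a minimal sketch. It
assumes Python; the function names (efficiency_index, group_mean_index)
and the scores are hypothetical and do not come from the study, whose
scoring was of course done by hand. The sketch computes the ratio for
each subject first and then averages the ratios within a group, which is
one reasonable reading of “compute for each subject an index.”

```python
from statistics import mean


def efficiency_index(num_correct: int, num_completed: int) -> float:
    """Ratio of problems correctly solved to problems completed."""
    if num_completed <= 0:
        # A subject who completed nothing has no defined index.
        raise ValueError("index is undefined when no problems were completed")
    if not 0 <= num_correct <= num_completed:
        raise ValueError("num_correct must lie between 0 and num_completed")
    return num_correct / num_completed


def group_mean_index(scores):
    """Mean of the per-subject indices for (correct, completed) pairs."""
    return mean(efficiency_index(c, n) for c, n in scores)


# Hypothetical (correct, completed) scores for three subjects; these are
# illustrative values only, not data from Table II.
example_group = [(40, 50), (30, 40), (45, 45)]
print(round(group_mean_index(example_group), 3))
```

Averaging per-subject ratios gives every subject equal weight regardless
of how many problems he attempted; dividing the group’s total correct by
its total completed would instead weight the heavier workers more.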

As Table II shows, on this index the experimental groups are again
significantly different (p < .03) from one another in the predicted
direction. However, in this instance the group cued to the expectation
that mature people perform poorly does not differ from the control group.
Though they are completing fewer problems, the percent of these that
are correctly completed is the same as with the control group. But the
opposite experimental group does differ from the control group. On
the average the subjects in this experimental group complete a few
fewer problems and solve a few more. In consequence the difference
between the control group and the “pleasure and efficiency” group
attains significance (p < .05). Clearly these experimental subjects have
been putting somewhat more effort into the task; they have been
concentrating more closely on the truly tiresome task of adding columns
of digits, and in consequence they have attained somewhat greater
accuracy.

Equally interesting and meaningful in the light of the results
reported, are some additional findings obtained with the Mood Adjective
Check List.

Avoiding the task of describing the scoring or analytical procedures,
I shall content myself here with simply reporting that on subscales of
this instrument we find the experimental groups differing either
significantly or at borderline levels from one another and, in some
instances, from the control group as well. The subjects in the group
cued to think that normal people enjoy routine tasks describe themselves
as feeling less dysphoric while doing the addition problems than
those cued to think that normal people do not enjoy them. These
characterizations tend to persist across the various intervals (every
fourth page in the addition problem booklet) at which the subjects were
required to report upon their mood states.

I have reviewed three manipulative studies each of which has
demonstrated our basic point: that systematic bias in experimental
responding can be produced through the arousal of evaluation
apprehension and the cueing of particular response patterns as likely to
foster positive evaluation.

However, two defects of this group of studies are apparent and they
should be noted here. The first is, simply, that they do not cover as
broad a range of experimental tasks as could be desired. Suspecting
that virtually any type of experimental performance could be
systematically biased by the evaluation apprehension process, we might
well have gone on to similar studies in such diverse areas as
conditioning and other learning phenomena, psychophysical judgment,
impression formation, concept formation, and many other areas.
Particularly I should have liked to test the proposition that the degree
of attitude change shown by subjects (in their responses to
questionnaires administered after a persuasive communication has been
received) will be influenced by the prior suggestion that attitude
change reflects a mature quality of “openmindedness” or an immature
quality of “inconstancy.”

Further work along some of these lines is planned. But, happily, the
task of demonstrating the broad relevance of evaluation apprehension
as a data biasing process has now been taken up by some other
investigators. By rather different experimental techniques than those
that I have employed, Silverman (1968; Silverman and Regula, 1968) has
been providing some evidence that could be easily fitted to the general
picture developed here. And Sigall, Aronson, and Van Hoose (1968) have
recently reported a study in which subjects are exposed to evaluation
apprehension cueing and also to the “demand characteristic” of the
experimenter’s expectancy about their performances, the latter
ostensibly based upon the scientific hypothesis he is testing. With
subjects for whom both forces converge, suggesting that a certain mode
of responding will prove the experimenter’s hypothesis and also make the
subject appear a competent and adequate personality, strong influence on
experimental responding is obtained. With another group of subjects
these forces are made to diverge, so that the subject must violate the
experimenter’s hypothesis if, as he sees it, he is to appear competent
and psychologically adequate. The typical subject yields to the latter
rather than the former force. Thus, even with a strong demand
characteristic opposing it, the evaluation apprehension dynamic is found
to exert a statistically significant influence upon the experimental
responding of the subjects.

Interesting and heartening as such studies are, much more experimental
exploration will be required before we can take as established the
claim that the systematic biasing of data through the evaluation
apprehension dynamic is a general phenomenon, one that can be made to
occur over the vast range of response dimensions with which modern
experimental psychology is concerned. My expectation is that such a
program of parametric exploratory studies would in fact reveal
considerable generality of this sort. At the same time it would probably
disclose that certain types of experimental responding are more prone,
and others more resistant, to this type of systematic biasing. Indeed, I
think it likely that one would also find that, within a given behavioral
realm, certain directions of responding are more easily affected by
evaluation apprehension pressures than are others. This has already
become apparent through our discoveries that liking of strangers or
inefficient performance on routine tasks are more readily inducible
response patterns than are their opposites.

A momentary lapse into unrestrained programmatic fantasy (an easy
indulgence if one puts aside the fact that someone must actually
undertake the vast labors that are contemplated) suggests the
desirability of constructing, through empirical techniques, a sort of
evaluation apprehension atlas of response dimensions. The hundreds of
types of elicited behaviors which now serve as dependent variables in
psychological research could be separately submitted to evaluation
apprehension cueing of the sort employed in our demonstration
experiments. The degree of influenceability of each particular response
pattern (and of separate response directions) could then be assessed.
Ideally, this would need to be done with systematic variation in types
of subjects, types of evaluation apprehension arousal, and types of
directional cueing. The result would probably have high payoff in terms
of increasing our ability to do uncontaminated, bias-free research, or
at least to come closer in approaching that utopian state of affairs.

I said earlier that I perceive two main defects in the group of
demonstration studies described here. The first, as discussed above, can
be handled only by doing more demonstration (and parametric exploration)
studies over the broad range of common dependent variables employed
in psychological research. The second defect is one that bears upon
the way in which such further studies might be conducted. What I
have in mind is the fact that in all of the foregoing studies the
manipulation had two separate components: evaluation apprehension was
aroused or heightened by our telling the subjects, in a fairly direct
way, that the responses they were about to make would have some
significance concerning their own personalities; then, in a separate and
subsequent portion of the communication, some hints (strong ones) were
given concerning the response differences to be expected as between
normal and abnormal or “mature” and “immature” persons.

Are both portions of the induction required? For that matter, can
subtler inductions be used without the loss of the systematic bias
effects? These questions point up a basic limit in the group of
demonstration studies so far reviewed: namely, that they have not
featured enough cross-experiment systematic variation in ways of
inducing evaluation apprehension. When such variation is attempted what
are we likely
to find? Both through speculative rumination and also in the light of
some of the data from additional studies that I shall shortly discuss,
I am willing to hazard some informed guesses. The first is that the
evaluation apprehension biasing effect does not depend upon providing
the subjects with an initial statement defining the experimenter as one
who is interested in the study of personality or who is otherwise
sensitive to the personality-revealing implications of the data he is
collecting. When this is done it probably does boost the data biasing
process, but the same sort of process is likely to be set in motion
merely by providing some cues suggesting that one mode of responding, as
compared to another, is more “normal” or competent or mature. The latter
strategy was the one employed by Sigall, Aronson, and Van Hoose (1968)
and it was sufficient to induce significant systematic bias.

However, what of the situation in which no direct cueing toward
the “normal” pattern of response is provided? Surely this is the typical
state of affairs in experiments in which evaluation apprehension is an
inadvertent rather than an intended influence upon subjects’ responding.
Theoretical analysis has rather persuaded me (and some studies,
reported later, on the mediation of the experimenter expectancy effect
have turned persuasion toward conviction) of this basic point: arousing
the subject to the general expectancy that his personality competence
will be available for judgment by the psychological experimenter sets
him examining salient aspects of the situation for what they might reveal
about the way a normal person would respond. In other words, when
evaluation apprehension has been aroused, by intention (as through the
first portion of the Background Information Sheet communication) or
unintentionally, direct cueing of the “normal” behavioral model is not
required. Subtler hints will be noticed and hypotheses will be
formulated by the subjects; and to the extent that separate subjects
attend to the same hints and draw the same interpretations, systematic
biasing of response data will be the result. An implied methodological
corrective is lurking in this [...] at a later [...] iteration here:
techniques for reducing subjects’ estimates of [...], or for eliminating
any initial concern that their psychological [...] will be open to
judgment, are bound to increase the trustworthiness of the data
collected in that experiment.

In my earlier discussion of the limits of our manipulation techniques
I asked: can subtler inductions be used without loss of the systematic
bias effects? By “subtler” I mean communications
which do the work of our Background Information Sheet (i.e., arousing
general evaluation apprehension, or cueing the subject in a particular
response direction, or both) far less explicitly, with more natural
indirectness. I am fairly sure that the answer is yes: that such subtler
manipulation will induce systematic bias in experimental responding.
To this purpose, a number of further demonstration studies have been
planned, but not yet executed. If their results are successful they will
give us a stronger empirical basis than we have yet established for
the claim that the evaluation apprehension dynamic does often operate
where it is usually unsuspected, e.g., in experiments undertaken to test
substantive issues and hypotheses relevant to important matters of
psychological theory. But, if this point is not yet fully established
through our demonstration efforts, I must, nevertheless, confess that
the studies we have already completed (both those described above and
those that follow in the next sections) have considerably strengthened
my own original suspicion, namely: Evaluation apprehension does
contaminate a fair portion of the experimental work now being conducted
over the broad range from social psychology to psychophysics.

To be sure, as I make this declaration I am mindful of various
considerations. Some of my readers will certainly think it a considerable
leap beyond the data, and they are right; but scientific inquiry, like
other more muscular pursuits, is advanced by the judicious use of
audacity. Also I am mindful that this sort of j’accuse, as it concerns
any experiment in which one suspects that evaluation apprehension has
distorted the data, cannot be sustained by a hundred, let alone three,
demonstration experiments; instead the logic of inquiry forces us back
to the necessity for undertaking carefully designed altered replication
studies.

However, the more we can learn about evaluation apprehension
through intentionally arousing it, the better equipped we will be to
search it out and bring it to heel through the altered replication
strategy. Thus, in further research I and my colleagues have gone on
creating evaluation apprehension and expanding our inquiry to encompass
subsidiary variables which may work to heighten or reduce its influence
upon experimental responding. I shall now turn to a review and
discussion of some of these further studies.

IV. VARIABLES INFLUENCING THE EVALUATION
APPREHENSION PROCESS

Though all of our original demonstration studies yielded significant
effects, there was a fair degree of intersubject variance within, as
well as between conditions. This suggested that uncontrolled factors
relating to the subjects’ personalities, their sensitivities to aspects
of the situation, and their patterns of past experience as subjects
might be affecting how much evaluation apprehension they felt and how
they were acting to reduce it.

Clearly a host of variables might be found to influence the evaluation
apprehension data biasing process, and the direction of such influence
might be either to facilitate or subdue the overall operation of the
process. I found it useful to conceive such “booster” and “suppressor”
variables as falling into five major categories. They could be
personality attributes (or overall personality patterns) of the subject,
aspects of the subject’s recent, preexperimental experience, aspects or
attributes of the experimenter, or of the experimental setting, or of
the experimental task. We need not think of this taxonomy as the most
logical of all possible ones, nor need we assert that it would
incorporate all relevant variables. Its main value was, simply, that it
was enough to get us started.

But we are just barely started on this line of inquiry. While many
relevant variables are easily conceivable, only four major ones have
been investigated in specific experiments. The results, which I shall
shortly present, have been quite informative both in confirming our
initial hypotheses and also, in two of these studies, by disclosing
certain more complex interactions which have, in turn, suggested some
new lines of theoretical speculation.

The four variables upon which this work has so far focused are:
the need for approval as an attribute of personality; the salience of
clinical orientation as an attribute of the experimenter or of the
experimental setting; the experimenter’s “gate-keeper” power over the
subject; and the ambiguity of the experimental stimulus materials.

I have already reported that the need for approval (as indexed by
scores on the Social Desirability Scale) seemed to play a
response-affecting role in the first of our demonstration studies. The
same appeared to be true in the study reported above in which
“efficient” and “inefficient” performance on routine addition problems
were separately cued as reflecting personality competence. In this
instance we found that under the cueing treatment suggesting that bored
and inefficient performance on routine tasks is a correlate of
“maturity,” subjects with high (upper 50 per cent) Social Desirability
scores did not differ from the control group. However, those with low
Social Desirability scores completed and correctly solved significantly
fewer problems. This finding, though unexpected, is a meaningful one, in
the sense that hindsight suggests some hypotheses that deserve further
experimental testing: perhaps those with strong approval need have in
the past been more concerned with (and, thus, more rewarded for) seeking
social acceptance through overt display of competence; or perhaps,
because of their greater general concern with techniques of
ingratiation, they find the cueing communication less credible than do
the subjects who are comparatively low in the need for social approval.

The personality attribute of need for approval, as indexed by SD
scores, was studied within the context of our original demonstration
studies. Our more directly focused work on booster and suppressor
effects began with two studies dealing with another variable: the
definition of the experimenter or the experimental situation as high or
comparatively low in clinical interest. In effect this can be viewed as
a simple “additive” variable, i.e., another direct force making for
evaluation apprehension in the subject and thus, in our typical
experimental situation, adding to the weight of forces that have already
served to create that motivating state.

In the first of our studies on the clinical interest variable, the same
sort of Background Information Sheet that was used in our earlier studies
was read by all experimental subjects. Once again its first portion was
designed to rouse general evaluation apprehension while its second
portion conveyed the clear hint, for one group, that mature people are
comparatively high on liking for strangers and, for the other group, on
disliking for strangers. Control subjects received no such preliminary
communication. So far this study is essentially a replication of our
original demonstration experiment. The additional variable was introduced
through a few memorized sentences which the experimenter addressed
to the subject.

The experimenters (senior students in an advanced experimental
psychology course at Dartmouth College) each ran six subjects: two who
had received the liking cueing, two the disliking cueing, and two control
subjects. To one subject from each of these three categories the
experimenter represented himself as having a “clinical” orientation. He
did this by saying, just before presenting the instructions for the
picture rating task, that he was quite fascinated by the experiment as,
indeed, he was by psychology generally. “I guess,” he continued, “that’s
because I’m always thinking about what makes people tick. That’s why I’m
hoping to go into psychiatry after I finish medical school.”

In the opposite “nonclinical” role that he played with his three other
subjects each experimenter said that he did not particularly see the
importance of the present experiment. He continued: “For that matter
I’m not sure what I’m doing in this course, but they said, at the School
of Engineering, that I had to take it.”
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the liking-cued groups, the information that liking of strangers is a sign 
of maturity was congruent with their initial response disposition. But 
the identification of the experimenter as having a special interest in 
“what makes people tick” and in “psychiatry” operated to raise the stakes 
for the subjects run by the clinical experimenter. To guarantee the win- 
ning of a positive evaluation from him the typical subject in this group 
strives to give extreme, and thus unambiguous, proof that he possesses 
the defined hallmark of the mature, psychologically healthy person. 

These are, then, unexceptional data. They seem to confirm the obvious 
and predicted relationship between the aroused strength of evaluation 
apprehension (the clinical definition having served to increase it, and 
the nonclinical definition to decrease it) and the degree to which the 
subject relies upon a response style he believes to be indicative of normal 
or attractive personality. 

But, persisting in the mood of parametric exploration rather than 
theoretical expansion, the following simple question might be asked: 
Must the experimenter directly define himself as having a special interest 
or ability in evaluating personalities? Or can the same sort of evaluation 
atmosphere be induced by other means? One additional experiment 
seemed to confirm the latter possibility. 

In this study we employed the same Background Information Sheet 
as in the previous one. By this means we again provided both for arousal 
of evaluation apprehension and directional cueing of responses in the 
“liking” and “disliking” directions respectively. And again we attempted 
to strengthen the evaluation apprehension dynamic by introducing an 
additional clinical implication into the experimental situation. Thus, be- 
fore they read the Background Information Sheet the subjects in one 
main treatment read a printed announcement concerning an impending 
study. This told them that “Dr. P. J. Schroeder,” a clinical psychologist 
from another institution, had asked our cooperation in recruiting subject 
for a large study on “student personality and adjustment in college^ i e. 
This study was being conducted on various different campuses. Participa- 
tion in it would involve the subject's being interviewed by Dr. Schroeder 
and allowing him to administer various “projective tests of personality. 
Dr. Schroeder, it was made clear, would treat the findings as wmp ete y 
confidential and, specifically, he would not disclose them to the expen- 
menter. The subjects were asked to sign for appointments “for this other, 
unrelated project” if they were so inclined. Virtually all the subjects 
did sign. , , 

When we compare the subjects in this treatment to others who were 
not exposed to it we find the former showing stronger ircctiona iM 
effects than the latter. Under the “Schroeder is coming condition the 
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The purpose of this manipulation was, then, simply to convey that 
the experimenter either had or lacked interest m understanding and 
judging the personalities of others At the same time, for the clinical 
role, It was clear that the experimenter was not yet professionally trained 
or skilled in this direction As he alternated between these two roles 
the experimenter ran his subjects without any knowledge of whether 
they were m the control group or m the groups that had been respec 
tivcly cued to the suggestion that hking or disliking for strangers was 
charactenstic of psychologically mature persons 

One hundred and fifty subjects gave their likmg-dislikmg ratings for 
15 photographs of male faces and the data from 130 of these were 
analyzed (The data from the 20 other subjects were discarded because 
postexpenmental questionnaire data showed that they had not under- 
stood or retamed the content of the like dislike portion of the 
communication ) 


The data clearly indicate that the definition of the experimenter as
either having or lacking a clinical orientation does, as predicted, have
some influence upon the amount of systematically biased responding
by the subjects. Under both the clinical and nonclinical experimenter
conditions the control subjects (who received neither evaluation apprehension
arousal nor directional cueing) lean toward an overall liking
response pattern, and there is no difference between the mean algebraic
sums of the ratings for the control subjects run by clinical and nonclinical
experimenters. For the former the mean of the algebraic sums is +25.20
and for the latter +23.00. In the clinical experimenter condition the
subjects who received the “disliking is mature” cueing have a mean
sum of +.13 while under the nonclinical experimenter condition the
mean sum is +5.80. Apparently somewhat greater deflections away from
the control group basal levels are occurring under the clinical condition.
However, in both instances the differences from the relevant control
groups are quite significant (p < .00003 and p < .0003, respectively).

A more clear-cut booster effect is obtained with the subjects who
received the “liking is mature” cueing. The subgroup run by nonclinical
experimenters has a mean sum of +20.80, and this is not significantly
different from the mean for the nonclinical control group. However,
the subgroup run by clinical experimenters shows a mean of
about +39.1. This differs significantly both from the mean of the clinical
control group (p < .003) and from that of the nonclinical, liking cued group.


The following conclusions seem reasonable. In this subject population
there was some tendency, as indicated by the control group data, to
give moderately positive judgments of the pictured persons. Thus, for
one intended to confirm and strengthen original promptings toward eval-
uation apprehension and the other to disconfirm and weaken them.

To convey fully the nature of this experiment I must linger briefly 
over the actual operations employed to establish the gatekeeper and 
nongatekeeper and high and low evaluation apprehension variables. 

The gatekeeper manipulation consisted of the experimenter (myself) 
reading aloud a letter that, as he explained it, he had recently received 
from the executive director of the “National Policy Foundation” on 
whose advisory board, as the letter made clear, the experimenter himself 
had served. The letter invited the experimenter to nominate some forty
undergraduate students to participate in a “regional collegiate council”
that would be discussing, as part of a funded national study, college 
student experiences and views on general issues concerning campus life 
and educational practices. These students, the letter made clear, would 
be well paid and through their participation they would “help to educate 
the experts and, hopefully, the academic and governmental hierarchs
who will be receiving the progress reports and the final report of the 
total study.” 

Having read this letter the experimenter said he had decided not 
to choose from among his own students but rather from the many sub- 
jects who were participating in “this week’s experimental sessions,” and 
that he would be “contacting some of you about this within the next 
week or so.” In the nongatekeeper portion of the study we elected to
say nothing along these lines, rather than to provide some contrasting 
or filler material. 

The communication designed to arouse and suppress general evalua- 
tion apprehension explained that the study the subject was about to 
participate in was being conducted by the Personality Assessment Project 
(high evaluation apprehension) or the Mathematical Psychology Project 
(low evaluation apprehension). The low evaluation apprehension com- 
munication stressed that we were “interested in integrating the various 
dimensions or factors involved in social perceptual processes into a more 
simplified model by a method which statisticians term ‘stochastic-inferen-
tial mathematical modeling’.” With these, and similar comments, we
intended to create in the subject the reassurance that we were studying
some rather obscure and technical issues and had no interest in his
unique personality. Thus, a further paragraph informed the subject that
“the photo ratings of all participants will be considered as a group
and an overall average score and statistical estimate of distribution vari-
ance will be obtained for this set of photographs.”

The high evaluation apprehension communication informed the sub- 
jects that “these photographs comprise a recently developed Test of 
difference in ratings between subjects cued in the liking and disliking
directions respectively is clearly significant (p < .02). Under the standard
condition comparable to the prior experiment, but lacking any
extra clinical implication, the comparable finding is p < .10. (Smaller
samples were used in this study than in the previous one, and with
variances of about the same magnitude the overall probabilities are,
as would be expected, somewhat larger.)

As I have already suggested, these are studies of limited import and
they offer no major surprises. Essentially, their value lies in lending
support to this basic point: any aspect of the experimenter (or of the
situation or setting in which he is encountered) that adds some further
implication of interest in psychological evaluation will tend to increase
the influence of the evaluation apprehension dynamic upon the subject’s
experimental responding. This statement assumes, of course, that some
other provocations toward evaluation apprehension are also acting upon
the subjects as, for example, the information that we conveyed through
the Background Information Sheet. However, it would seem quite likely
that our additional factors (i.e., the undergraduate experimenter’s confessed
clinical interest or the subject’s elicited commitment to participate
in a later personality evaluative study) could operate as sufficient factors
in and of themselves. Further research would be required to confirm
this rather obvious speculation.

But obvious relationships (even when they raise questions about the
underlying and somewhat obscure sequences of events that mediate
them) are less compelling than findings that raise new and unexpected
issues. Therefore, rather than linger over the findings reviewed above,
I shall turn now to some further preliminary studies concerning other
variables. In both of these studies the major hypotheses were confirmed,
but certain unexpected relationships were also encountered, and they
are of a type that promises to deepen our inquiry into the operation
of the evaluation apprehension dynamic.

In one of these studies we attempted to examine the consequences
of making the experimenter a “gatekeeper” for the subjects. By this
we meant, simply, that the experimenter was to be perceived by the
subjects as likely to allow some, but not all, of them into some rewarding
activity area. In addition to setting up gatekeeper and nongatekeeper
conditions we also treated the manipulation of evaluation apprehension
in a new way. In previous studies our Background Information Sheet
had been designed to arouse (or confirm and amplify) evaluation apprehension,
and control subjects who did not receive the Background Information
Sheet provided the necessary baseline data. However, in the
present study we used two forms of the Background Information Sheet,
to all subjects. After this rating sheet had been completed the sequence
of pictures was presented again while the subject rated each of the
pictured persons for “how successful” they had been. A third exposure
of the pictures was then given while the subjects rated the pictured
persons for “how intelligent” they appeared.* All pictures were exposed
for ten seconds each, with a following ten second interval during which
the subject wrote his rating on a scale from -10 to +10. Two postexperimental
questionnaires, administered both before and after a thorough
debriefing, provided strong evidence that the manipulations had been
successful and that very little suspicion had been aroused as to our
real purpose.

I have so far described the procedures of this study without any
direct reference to the hypotheses that guided it. However, they are
probably already apparent. The gatekeeper manipulation was intended
to increase the desirability of winning a positive evaluation from the
experimenter, for this would now have the additional payoff value of
increasing the probability of being chosen for membership in the interesting
and remunerative student discussion group that was being set
up by the “National Policy Foundation.” Thus we predicted that response
dependence upon the directional cueing would be greater for
subjects in the gatekeeper condition than for those in the nongatekeeper
condition.

Similarly we expected that subjects receiving the high evaluation apprehension
manipulation would show stronger response bias effects than
those receiving the communication that was designed to reduce evaluation
apprehension. And, of course, we were interested in the possibility
of a meaningful interaction between the two major variables, and also
their respective and combined interactions with the like-dislike cueing
variable.

This rather complex study, with 12 separate cells in a 2 × 2 × 3 design,
and with considerable data drawn from postexperimental questionnaires
and inquiry, yielded a great deal of information, and full presentation
and analysis can only be attempted in a lengthy, separate article.
Thus I shall dwell here only upon some of the major findings and their
probable meaning.
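For readers who wish to see the factorial layout spelled out, the 12 cells of the 2 × 2 × 3 design can be enumerated mechanically. The following Python sketch is offered only as an illustration: the level labels are shorthand assumed here for the conditions named in the text (gatekeeper or nongatekeeper; high or low evaluation apprehension; liking cueing, disliking cueing, or no cueing) and are not terms taken from the study's own materials.

```python
from itertools import product

# Factor levels as described in the text; the label strings are
# illustrative shorthand, not wording from the original study materials.
GATEKEEPER = ("gatekeeper", "nongatekeeper")
APPREHENSION = ("high", "low")
CUEING = ("liking", "disliking", "none")


def design_cells():
    """Return every (gatekeeper, apprehension, cueing) combination."""
    return list(product(GATEKEEPER, APPREHENSION, CUEING))


if __name__ == "__main__":
    cells = design_cells()
    print(len(cells))  # number of cells in the full crossing
    for cell in cells:
        print(cell)
```

Filtering the list on its first element yields the six cells of each half of the experiment, which is how the gatekeeper and nongatekeeper data are analyzed separately.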

* It should be clear that the subjects had not received any directional cueing
concerning the personality revealing relevance of judgments that others have been
successful or are intelligent. However, judging another as possessing these qualities
would represent a positive evaluation of him. Thus we expected some generalization
from the subjects' judgments on the like-dislike dimension onto these two other
judgmental scales. Also, evidence of such generalization (or of such indirect cueing
effects) could be taken as an additional measure of the degree to which the directional
cueing was utilized by the subject.
Psychological Maturity. The pattern of responses given to these photographs
by psychologically mature undergraduate students has been
shown to be significantly different from the responses of those manifesting
psychological maladjustment.” The communication went on to assert
that our research was pointed toward “further development of the picture
rating personality test as an instrument for picking out from the large
college population those students who would be judged inadequate in
their emotional adjustment or generally prone to psychological
disorders.”

Postexperimental questionnaire data established that the low evaluation
apprehension communication did significantly “reassure” those who
read it while the high evaluation apprehension communication did increase
anxiety and set the person toward concern over what impression
he would make upon the experimenter.

The contrast with our earlier uses of the Background Information
Sheet should be clear. Not only were we attempting to remove evaluation
apprehension in some subjects while strengthening it in others, but we
were also providing no cueing that directly reported that liking or disliking
for strangers had been found to be characteristic of psychologically
mature persons. Instead, a more limited, or one might say, less obvious
and intrusive form of directional cueing was employed. Each experimental
subject, after he had been exposed to the gatekeeper or nongatekeeper
and high or low evaluation apprehension manipulations, read
a two paragraph communication which simply reported that previous
research with the pictures he was about to rate had shown that most
people judged them positively (liking) or negatively (disliking). While
one third of our population of 148 males received the liking
cueing, and another third the disliking cueing, the
“control group” received no directional cueing and thus served as a
[...] we were able to achieve [...] nongatekeeper subjects [...]

The subject then read the high or low evaluation apprehension manipulation
which, under instruction, he had [...] on the table within his
booth. He then [...] unless a cueing control [...] communication.
Following this he [...] liking-disliking judgments for each of
the 15 pictured faces as they were projected on a screen easily visible
In both analyses we obtained clear evidence of a cueing effect. The
algebraic sums of the subjects’ ratings on the like-dislike dimension
strongly reflect the cueing that was received: those who got positive 
cueing gave more positive ratings than those who got no cueing and 
these, in turn, gave more positive ratings than those who got negative 
cueing. In the nongatekeeper half of the experiment the p value for 
this effect is less than .0001. Also the effect does appear to generalize 
to the ratings of “success” (p < .05) and “intelligence” (p < .006).

Considering only the four groups that received cueing in either the 
positive or negative direction (i.e. eliminating the two no cueing, control 
groups) our other major prediction was confirmed. In the nongatekeeper 
portion of the study a significant interaction is obtained between cueing 
and evaluation apprehension level as regards the liking ratings 
(p < .03). This is due to the fact that when subjects have been roused 
to a state of evaluation apprehension their picture ratings are more 
extremely influenced by either the positive or negative cueing than when 
they are at a low or suppressed level of evaluation apprehension. Thus, 
the mean of the algebraic sums for the high evaluation apprehension 
subjects who received positive cueing is some 53 points more positive 
than the mean for the high evaluation apprehension subjects who re- 
ceived negative cueing. For the low evaluation apprehension group the 
comparable discrepancy, while in the same direction, is only 25 points. 

Similar effects of lesser magnitude and statistical significance are ob- 
tained when we compare the two evaluation apprehension groups on 
their ratings of the pictures for success and intelligence. The probabilities 
for the overall evaluation apprehension by cueing interactions on these 
two dependent variables are less than .12 and .19, respectively. 

In passing, it is worth noting that within the high evaluation apprehen- 
sion condition the differences between the scores from the positively 
and negatively cued subjects are significant at probabilities of .008 or 
less for each of the three dependent variables; while the parallel analysis 
with the low evaluation apprehension subjects yields a significant proba- 
bility only on the liking ratings. 

I have dwelt upon these results because they suggest a point of par- 
ticular interest both as concerns an emerging theory of the self-presenta- 
tion process and also as they bear upon an important methodological 
issue. The kind of directional cueing intentionally provided in this study 
is often unintentionally present in other research situations, both of ex- 
perimental and survey form (e.g. the respondent in the typical public 
affairs study often has a fairly clear idea, whether accurate or not, of 
“how most people would probably answer” on some of the more salient 
issues). More “valid” data (i.e. more accurate self-representations) are 
Table III presents the mean algebraic sums of the liking ratings for
the six cells that received the gatekeeper treatment and, separately, for
the six cells in the nongatekeeper treatment. The probabilities of the
differences between relevant pairs of cells are also presented. Reference
to these tables will help to illuminate the findings from the separate
analyses of variance that were carried out for both the gatekeeper and
nongatekeeper conditions.
hypotheses. However, when we examine the data from the high evaluation
apprehension gatekeeper subjects, one major surprise is encountered.
Unlike the results with the low evaluation apprehension subjects,
the introduction of the gatekeeper condition (which was intended as
an extra force compelling the subject toward reliance upon the directional
cueing) seems in fact to reduce such reliance for the positively,
but not negatively, cued group. In the high evaluation apprehension
nongatekeeper and gatekeeper conditions the mean algebraic sums for
the liking ratings in the absence of any directional cueing are 5.42 and
5.09, respectively. But whereas the liking sums for the positively cued
subjects in the former group have a mean of 33.50 (and thus the difference
between the control and positively cued subjects is 28.08), in the
latter group the positively cued subjects yield a mean sum of only 13.82
(making the difference between the control and positively cued subjects
only 8.73). Similar findings are obtained with the dependent variables
of success ratings and intelligence ratings.
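The two differences cited in this paragraph follow directly from the four reported means, and the arithmetic can be checked with a few lines of Python. This is a minimal sketch, not part of the original analysis; the dictionary keys and the function name `cueing_shift` are labels assumed here for illustration, while the four values are the means for the high evaluation apprehension subjects as given in the text.

```python
# Mean algebraic sums of the liking ratings for the high evaluation
# apprehension subjects, as reported in the text. The keys are
# illustrative labels, not terms from the original tables.
MEANS = {
    "nongatekeeper": {"control": 5.42, "positive": 33.50},
    "gatekeeper": {"control": 5.09, "positive": 13.82},
}


def cueing_shift(condition):
    """Positively cued mean minus uncued control mean, to two decimals."""
    cell = MEANS[condition]
    return round(cell["positive"] - cell["control"], 2)


if __name__ == "__main__":
    for condition in MEANS:
        print(condition, cueing_shift(condition))
```

The shift produced by positive cueing is roughly three times larger without the gatekeeper treatment than with it, which is the surprise the paragraph describes.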

A possible interpretation is that the combination of the high evaluation
apprehension and gatekeeper treatments strains the subjects’ credulity
or, perhaps, puts them under a degree of tension which inhibits or
otherwise disrupts their readiness to be influenced by the directional
cueing. But the absence of the same pattern with the negatively cued
groups limits the applicability of this interpretation. Subtler possibilities
have occurred to us, but their explication had best await the results
of further data analyses that are yet to be executed. These last findings
comprise one of the valuable surprises of which I spoke earlier, and
I must confess considerable interest in further experimental investigation
in this particular realm as well as considerable frustration over the
tantalizing ambiguity that presently beclouds the issue.

Among many further subsidiary findings obtained in this experiment
I shall mention only one other. A postexperimental index of the anxiety
aroused by the high evaluation apprehension communication is strongly
correlated with the degree to which the subjects in the experimental
groups were influenced by the directional cueing that they received.
This serves to reinforce our general theoretical view while also suggesting
the importance of apprehension proneness as a mediating, personality-linked
variable.

While I have not here attempted a full description of the procedures
of this complex study or of all the available analyses, enough has been
presented to make clear the basis for the following conclusions. Evaluation
apprehension has again been shown to be a factor, or process,
that mediates systematic biasing of the sort that is due to cueing (in
this study, somewhat more indirect cueing than in our previous work)
likely to be obtained when we attempt to reduce evaluation apprehension
through some preliminary communication which disconfirms the
subject’s or respondent’s concern that his psychological maturity (or,
for that matter, his “public spiritedness” or “patriotism”) may be open
to assessment and evaluative judgment.

Yet the fact is that even with an apparently successful reduction of
evaluation apprehension (judging by the postexperimental questionnaire
data from the low evaluation apprehension subjects) the directional
cueing still exerts some influence. Probably this indicates some residuum
of persisting evaluation apprehension and, if so interpreted, it points
up the necessity for developing even more effective techniques for giving
subjects or respondents the sort of reassurance which allows them to
be their typical selves (i.e., uninfluenced by situational and inadvertent
cueing factors) when reporting on their own judgmental or attitudinal
processes.


So far the discussion has been restricted to the findings from the
nongatekeeper portion of the experiment. With the data from the gatekeeper
portion of this study we encounter a number of interesting patterns,
particularly when they are viewed in relation to the comparable
nongatekeeper experimental groups. Whereas the positively and negatively
cued subjects in the nongatekeeper, low evaluation apprehension
condition differed significantly only on their liking ratings of the pictures,
mental standards, exert some influence upon his ratings. This is likely
to be true even when a larger part of the variance in the ratings is 
controlled by the arousal and directional channeling of evaluation appre- 
hension. To make the pictures more ambiguous is to make the stimulus 
attributes less readily available. This, in turn, should foster a further
intensification of the subject’s reliance upon such cueing as he may 
have received and thus the bias effects should be intensified. 

Table IV presents the mean algebraic sums of the liking ratings for 
of the preferred pattern of experimental responding. A second variable,
namely the perception of the experimenter as a “gatekeeper” (i.e., as
one who controls access to further reward or ego enhancement) has
been shown to facilitate yielding to directional cueing, particularly when
evaluation apprehension has been brought to a low, or inoperative, level.
But the combination of high evaluation apprehension and the gatekeeper
variables has not, as we thought it would, worked to maximize the
degree of influence upon experimental responding that is exerted by
directional cueing. Whether this is due to some artifactual considerations
(or to some unintended and subtler pattern of evaluation apprehension
that has, in turn, generated a more obscure response strategy) or
whether it is our first encounter with a truly general effect, remains
to be determined through further research.

In general this study does appear to add force to the claim that evaluation
apprehension can contaminate the data gathering process, and it
directs us toward a more complex consideration of other variables that
interact with evaluation apprehension.
gesting that most persons in past research have rated the pictures nega-
tively face a conflict between their own expectations or half-shaped
hypothesis and the directional cueing that has been addressed to them. 
With stimulus ambiguity high they may, in the resultant state of uncer- 
tainty, fall back upon their own, original expectations, and thus the 
positive cueing works more effectively upon them than does the negative 
cueing. However, with high clarity and detail in the photographs, typical 
subjects may be able to find evidence in facial and expressive characteris- 
tics onto which they can more readily impose the negative judgments 
that, according to the negative directional cueing, are typically made 
by “most people” who view these particular photographs. 

That the yielding to the negative cueing under the nonambiguous 
condition is greater for high evaluation apprehension than low evaluation 
apprehension subjects (the difference between control and dislike group 
means being 38.91 for the former and 24.72 for the latter) suggests 
the further pertinence of the interpretation offered here; for the high 
evaluation apprehension subjects, believing they are undergoing indirect 
personality assessment, have a greater stake in regulating their responses 
in the cued direction. In effect, our interpretation, reduced to its simplest 
form, suggests this further hypothesis: to yield to directional cueing 
that endorses an unpracticed response style, the person needs “something
to work with,” i.e., some supporting aspects in the experimental situation
or in the proffered stimulus material which will enable him to view
his yielding to the directional cueing as having some basis in reality
rather than solely in his need to win a positive evaluation.

Clearly this line of speculation, if strengthened by later research, 
moves our inquiry into self-presentation processes toward a subtler and 
more difficult kind of theorizing; one which will have to give fuller 
representation than heretofore to the limits and lures that the total experimental
context provides for the subject who is attempting to regulate
his experimental responding in a way that serves both his need for
approval from others and, at the same time, from himself.

As I said in opening this section, “we are just barely started on this
line of inquiry.” Having now reviewed our completed studies on variables
that strengthen or reduce the data biasing influence of evaluation
apprehension I am all the more sensitive to the fact that this work
has a decidedly preliminary air about it. Much more inquiry is required
and as it proceeds we must get beyond our present and too simple
classificatory taxonomy of variables and into the construction of a process
or systems model of the flow of the evaluation apprehension dynamic.
Further work along these lines, both experimental and theoretical, is
contemplated. But for now we can, I think, conclude that at least this
the six cells in the ambiguity treatment and, separately, for the six cells
in the nonambiguity treatment. The significant differences reported in
the table help to make clear the findings from the analyses of variance
that we carried out for both the ambiguity and nonambiguity conditions.

Analysis of variance of the ratings from the half of the study in which
the subjects rated the unambiguous photographs reveals comparatively
strong cueing effects. On the liking ratings the cueing effect is highly
significant (p < .0001), and for the success and intelligence ratings they
are of borderline significance (p < .15, p < .07 respectively).

Analysis of variance of the ratings from the half of the study run
under the condition of stimulus ambiguity also reveals a significant main
effect for the liking ratings (p < .002), but no effect for the success
ratings (p < .68), and a borderline effect for the intelligence ratings
(p < .14).

However, while the like-dislike directional cueing exerts the predicted
influence we find that two other expectations are not directly confirmed.
Within the separate ambiguity conditions we do not find that the high
evaluation apprehension subjects are significantly more influenced by
much has been established. Between the initial arousal of evaluation
apprehension and the ultimate tilting of experimental responses in the
direction that, as the subject sees it, will maximize positive evaluation,
there is scope for influence through many intervening and subsidiary
variables. The few we have so far investigated appear to me to derive
their influence in either or both of two ways: they may directly affect
the subject’s perceptions of how his responses will be judged, or they
may affect his estimate of the importance of winning a positive evaluation
from the particular experimenter in his particular experimental
setting.


V. EVALUATION APPREHENSION AND THE EXPERIMENTER 
EXPECTANCY EFFECT

Three research strategies have been featured in the work I have already
reported: altered replication, demonstration experiments, and experiments
on intervening or additive variables. Yet one other related
research approach has figured in our recent work on the evaluation
apprehension process. Simply described, this involves manipulating evaluation
apprehension (by arousing or confirming it for some groups and
suppressing or disconfirming it for others) and then examining the consequences
for some phenomenon or relationship of psychological
interest. [...] would appear to be relevant whenever one suspects
that evaluation apprehension operates as a mediating or facilitating
[...] in [...] established relationship between other variables.

[...] evaluation apprehension may well be involved in the experimenter
expectancy effect, i.e., [...] in the way in which the subject perceives
the experimenter's [...] within the experimental situation. [...]
Rosenthal (1966) and Friedman (1967) [...] suggests the experimenter's
expectancy is subtly communicated [...] expressive style [...] the subject
who is possessed of [...] be more closely and [...], or he
may be more motivated to act [...] indirectly communicated [...]
subjects: one aroused to a high level of evaluation apprehension and
one in which all tendencies toward this pattern of concern have been
effectively diminished.

In a sense this research strategy can be viewed not as a fourth and
new one, but as a variant of the altered replication approach described
in the first section of this chapter; but in this variant instead of eliminating
evaluation apprehension we attempt also to arouse it. However
one wishes to classify it, this strategy has proved effective in the one
realm in which it has already been employed. As the title of this section
and the illustration offered above have already suggested, that realm has
been the further study of the mediation of the experimenter expectancy
effect.

Before I turn to an account of our studies in this area I should like
to comment briefly upon the relationship between my own preoccupation
with the evaluation apprehension process and the work of other
investigators of the “social psychology of the psychological experiment.” From
the record of research (much of which is summarized in other chapters
in this volume) on demand characteristics, subject presensitization,
volunteer effects, and experimenter expectancy effects, it seems abundantly
clear that there are a number of sources of systematic bias in
experimental data. For a long time these went unsuspected and, it can be
assumed, contributed considerable nonrandom error to the data through
which theoretical propositions were tested or inspired.

In the main I am persuaded by the work of others that the various
processes that have been conceived as making for systematic bias do,
in fact, have considerable operative force. And, obviously, I think and
have tried to show that the same is true of the evaluation apprehension
process.

We have then developed an empirically verified catalogue of data-biasing
variables and processes. So far so good. But it seems apparent
to me that we have now reached a stage at which we need not be
content with a mere catalogue. Some larger, more integrative theory
of the experimental transactional process is required. The development
of such a theory will afford intellectual satisfaction in itself, but, equally
important, it will probably also contribute to a richer understanding
of the role of self-representational dynamics in nonexperimental,
[interac]tion situations, and, of course, it will promise considerable further
advance in improving the methods of research design and execution in
all those disciplines (psychology is only one) whose data are gathered
through interaction between the investigator and other, investigated
persons.

I shall not presume to suggest the possible shape of a full and general
integrative theory of the experimental process, though in the concluding
section I shall risk a few preliminary speculations upon some aspects
of such a theory. However, at this point I want only to register this
obvious point: the development of this sort of theory will be advanced
by (indeed, it may require) the prior investigation of the interaction
and overlap between the biasing processes that are now separately
delineated in our catalogue. A few investigations of this type have already
been attempted; the study by Sigall, Aronson, and Van Hoose (1968),
discussed above, is one. The three studies I shall now describe represent
another such contribution. They are all focused upon the interaction
between evaluation apprehension and experimenter expectancy. More
particularly they are attempts to test the proposition already advanced,
i.e., that the experimenter expectancy effect is mediated or facilitated
by evaluation apprehension. At the same time, the last of these studies
also bears upon another important aspect of the experimenter expectancy
effect, namely, the paralinguistic content of the experimenter’s
communications to the subject.

In our first effort in this realm my coinvestigator was Marshall Minor,
[...] served as his doctoral dissertation (Minor, [1967]). [...]
two basic purposes: to replicate Rosenthal’s finding that
[...] by an experimenter can introduce “experimenter [...]

[...] experimenter to keep his scheduled appointment, reduced the actual
situation to one in which 15 experimenters ran 23 male and 16 female
subjects. However, it was possible to maintain partial balance in subject
assignment and to effect statistical analytic controls for the “holes” in
the matrix of experimenter-subject pairs that were actually completed.
The latter type of control was made possible by use of the University
of Chicago MESA 95 computer program (we are indebted to Professor
Darrel Bock for introducing us to this program and for helping us to
fit it to our needs) in which, as each mean square for an effect is
determined, a transformation is made on the model by eliminating previously
estimated effects from the succeeding mean squares. This made it
possible for us to get an estimate of the expectancy X evaluation
apprehension interaction which was independent of sex and sequence
variables. (A full account of the handling of this and other analytic
problems is found in Minor, 1967, and the special analysis of variance
program employed is described in Bock, 1965.)

Some words about procedures and operations are required before
we review the basic findings of this study.
The naive experimenters, all male graduate students in the school
of education, reported individually for the scheduled experiment in
whose execution they had promised to assist. After receiving a standardized
background introduction to the experimental task each experimenter
read a document giving further instructions. These expectancy-establishing
instructions comprised a modified version of one of the methods
used by Rosenthal (1960). The crucial content was in the last paragraph:


“For your information, we have found from past research that certain
types of people tend to rate these pictures in very particular ways.
On the basis of personality test data that we collected earlier this quarter
from the subjects you will be running, you should expect them to give
an overall average rating of +5 (-5). This is because our earlier testing
has shown that all of the subjects whom you will be running are in
the category that we are calling ‘success perceivers’ (‘failure perceivers’).
Thus if you will follow instructions and use proper experimental
procedure, they will rate the pictures more extremely [positive] (extremely
negative), thus obtaining an average of about +5 (-5), [than would]
be the case if your subjects had simply been selected at random.”

Each subject, before being introduced to the experimenter [who was]
to administer the experimental task, was asked to “please read [this],
which will give you a brief description of the purpose of the [...].”
Half of the subjects read an explanation intended to [arouse a]
high and stable magnitude of evaluation apprehension. The other [half]
read an explanation designed to reduce and, hopefully, eliminate any
tendency toward evaluation apprehension that the subjects might have
brought with them into the experimental situation. The full logic that
lay behind the construction of these two communications (which are
similar to the ones used in our earlier studies on the gatekeeper and
ambiguity variables) will not be traced here. However, much of it should
be apparent from the content of the high and low evaluation apprehension
communications, which are fully reproduced below.


High Evaluation Apprehension Communication

“Today, you will be participating in a psychological experiment, and
shortly you will be assigned to an experimenter who will explain the
task to you. Although we are not able to answer any questions until
after the experiment is over, we do want to give you a brief description
of the purpose of the experiment. This should make participating more
interesting and meaningful for you. Also, a growing number of psychological
researchers are beginning to realize that they have an ethical
[obligation to make the] purpose of their experiments known to the
[people who are] helping them out by participating in their research.

[We are interested in studying] social perception (i.e., how people
[perceive other people]). [More] specifically, we want to find the factors
which [increase or decrease the] accuracy of an individual’s perception
of other [people] [...] understanding in the world today.

[...] and others indicates that, typically, poor social perception
[is associated with] psychopathology. That is, people who are not able
to [accurately] perceive how other people are feeling, or what they are
[experiencing, tend to be] psychologically maladjusted. Much of our
[...] in this area indicates that on the basis of performance [on the
...] task, we can pick out from a college population [those] students
who would be judged clinically to be maladjusted. [...] support the
preceding findings [...] (Journal of Abnormal and Social Psychology,
1963), for example, [...] Perception Test could [...] varying degrees of
emotional maladjustment [...].

The purpose of today’s experiment [is to replicate the] previous
results, and thus to test further the [proposition that] people who
cannot accurately judge [what other people are] experiencing
tend to be psychologically maladjusted.”

Low Evaluation Apprehension Communication

“Today, you will be helping us to collect some preliminary data which
we will use in setting up a subsequent research project. Shortly, you
will be assigned to an experimenter who will explain the task to you.
Time does not permit us to answer any questions, but we are able
to give you a brief description of the purpose of the study. This should
make participating more interesting and meaningful for you.

We are interested in studying social perception (i.e., how people
perceive other people). More specifically, we want to find the factors
(e.g., fatigue, practice, etc.) which increase or decrease the accuracy
of an individual’s perception of other people.

Before we can investigate these different factors, however, we have
to know how people perceive the feelings and experiences of others
when these experimental factors are not present.

That is, we need a control, or standardization, group to use as a
baseline against which we can judge the effects that our experimental
factors have on social perception. This is the reason for your participation
today.

We intend to average the performance of all of the students participating
today, so that we will have a measure of how subjects perform
on the task when such experimental variables as fatigue and prior
practice are not present. This information will allow us to judge the effects
which our experimental variables have when they are used with a
subsequent group of students.

In other words, today’s group will help us to find out how subjects
typically perform on the task. Later, we can use the data we receive
here to judge the performances of subsequent experimental groups of
subjects.”

As in the typical Rosenthal experiment, interaction between experimenter
and subject was held to a minimum level in which the
experimenter read the picture rating instructions to the subject and collected
his ratings for each of the ten pictures. Upon completion of this phase
the subject, no longer in contact with the experimenter, filled out an
extensive postexperimental questionnaire and was thoroughly
interviewed. The same was done with each experimenter after he had
completed running all of his assigned subjects. In a last phase experimenters
and subjects were brought together for a full debriefing and for
extended discussion, considerable care being taken to alleviate any
concern that might be felt by subjects who had been assigned [to]
the high evaluation condition.

From the full analysis of variance three significant findings were
obtained. Between experimenters, the expectancy variable (+5 versus -5
experimenter expectancy) controls a significant portion of the variance
in their subjects’ ratings of how successful the pictured persons have
been (p < .05). In the “within experimenters” analysis the sex of the
subjects operates significantly (p < .03), reflecting a general tendency
for females in either the +5 or -5 expectancy groups to rate the
pictured persons as less successful than the respective male subjects in
the same expectancy treatments. Most relevant to our major interest
is the finding of a rather strong interaction between expectancy and
evaluation apprehension (p < .02).

The basis for this significant interaction is clearly revealed by a
comparison of the mean photo ratings obtained from the +5 and -5
expectancy groups under both the high and low evaluation apprehension
conditions respectively. (The male-female proportions are roughly
equivalent in each of these four groups.) With evaluation apprehension
[...] suppressed (i.e., under the low evaluation apprehension treatment)
the mean picture ratings for the +5 and -5 expectancy groups
[are -.78 and -.59 respectively] [...] the difference between these means
[...]. [In the high] evaluation apprehension condition the
+5 [and -5] expectancy group means are +.16 and -1.06 respectively.
[...]

[...] further strengthen our overall conclusions. These matters will be
[reported in] a separate publication. However, one particular [finding
deserves notice] here. An index reflecting [the] degree [to which each
experimenter was] successful in inducing bias under [the high evaluation]
apprehension condition was computed. This index [was] correlated with
the scores from various questionnaires [that were] administered to the
experimenters after they had run all [their subjects]. [The]
correlations obtained were those [with the Marlowe-]Crowne Social
Desirability scale (r = .40, p < .06) and [...] (p < .01). [These findings]
suggest the [possibility] that evaluation apprehension is involved not only
[in attuning] the subject to the experimenter’s bias-inducing [cues]
but perhaps also in setting the experimenter to emit such [cues]. [They]
suggest an empirical hypothesis worthy of [further test: ...]

Predicted order of means      Obtained means

High EA, +5 EE                +.16
Low EA,  +5 EE                -.78
Low EA,  -5 EE                -.59
High EA, -5 EE                -1.06


FIGURE 1

Response to Experimenter Expectancy as a Function of
Evaluation Apprehension Level

[...] affect performance when the experimenters to whom these expectancies
[have] been assigned have a high need for approval and a tendency
[to be] apprehensive over the evaluation of their own competence.
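
The interaction can be read off the four obtained means in Figure 1. The short Python sketch below is offered only as an illustration: the variable and function names are ours, and it computes simple descriptive differences between cell means, not the significance test produced by the analysis of variance program mentioned earlier.

```python
# Obtained mean picture ratings from Figure 1, keyed by
# (evaluation apprehension level, assigned expectancy).
cell_means = {
    ("high", +5): 0.16,
    ("low", +5): -0.78,
    ("low", -5): -0.59,
    ("high", -5): -1.06,
}


def expectancy_effect(ea_level):
    """Difference between the +5 and -5 expectancy means at one EA level."""
    return cell_means[(ea_level, +5)] - cell_means[(ea_level, -5)]


high_effect = expectancy_effect("high")  # expectancy effect under high EA
low_effect = expectancy_effect("low")    # expectancy effect under low EA

# Descriptive interaction contrast: how much larger the expectancy
# effect is under high EA than under low EA.
interaction_contrast = high_effect - low_effect

print(round(high_effect, 2), round(low_effect, 2), round(interaction_contrast, 2))
```

On these figures the +5 and -5 groups differ by about 1.22 scale points under high evaluation apprehension, while under low evaluation apprehension the difference is about -.19, a slight reversal of the expected direction.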
Upon completion of the analysis of this study we decided to attempt
[an] altered and expanded replication. The major intended changes
[were]: to completely fill the matrix of required experimenter-subject
combinations and thus handle the problem of sequence effects [without]
recourse to the sort of statistical corrections that were required in the
previous study; to run an “evaluation apprehension control” [condition
in which] we would attempt neither to increase nor diminish the [subject’s]
original nonmanipulated evaluation apprehension level; to run a “zero
expectancy” as well as a +5 and -5 expectancy condition. A further
purpose was to try out a way of staging the study which combined
some features of mass administration (e.g., subjects reading the initial
evaluation apprehension communications while waiting in a large
reception room) with the individual running of subjects on the picture rating
task. Our hope in this last regard was to increase the efficiency of our
own experimental procedures and those employed earlier by the
Rosenthal group.

In this study, then, each of 33 male experimenters (11 each having
been given the +5, 0, or -5 expectancies respectively) ran 3 male
subjects on the Rosenthal picture rating task (one each having first
received the high, low, or control evaluation apprehension
communication respectively). The first two of these communications were slightly
modified versions of the ones used in the earlier study and the last
was a simpler one that merely advised the subject that he would shortly
be assigned to an experimenter and asked him to wait until called.


[...] failed to replicate the basic experimenter expectancy effect.
[...] questionnaire and interview data from both subjects [and
experimenters] [...] we seem to have aroused considerable [suspicion]
about our own unrevealed purposes and about the [...] to manipulate
evaluation apprehension. [...] the suspicion-arousing aspects of [the]
procedures used, it may be said that a number of valuable cautionary
points became clear to us and that we have profited from [them in our
later] attempts at experimental investigation of [evaluation] apprehension
and kindred processes. In fact, [from this] attempt at running a partially
group-administered [experiment we] were able to develop a different
[procedure, which] was used in our next study in this sequence. This
[procedure is as] efficient or more so, and yet seems to keep suspicion
and other intrusive artifacts at a very low level.


Before turning to a description of the [study just] referred to,
it will be necessary to briefly describe a [study that was] stimulated by
our earlier work but was not undertaken as [...].
Starkey Duncan, a clinical psychologist [with an interest] in
paralinguistic aspects of communication within the psychotherapeutic
situation, became interested in our work on experimenter bias effects.
Through our joint consultations he came to the conclusion that the
mediation of biasing cues in the Rosenthal paradigmatic situation might
largely depend upon variations in the nonlinguistic aspects of the
experimenter’s spoken communications to the subject. Particularly, he
conjectured that the way in which the experimenter varied the intensity,
intonation, pitch, and rhythm aspects of his reading of the instructions
for the picture rating task might convey to the subject an extralinguistic
(or, more properly, a “paralinguistic”) indication of the experimenter’s
expectancy regarding the responses the subject was about to make.
Duncan and Rosenthal proceeded to design a preliminary study to
test this hypothesis. From films provided by Rosenthal, Duncan
transcribed sound tapes of vocal readings of the instructions: three from
each of two comparatively high-biasing experimenters and four from
a third high-biasing experimenter. Together with the films from which
the tapes were made, Rosenthal also provided the picture rating data
obtained from the respective subjects who had received these separate
vocal readings of the instructions. The taped readings were blindly coded
on a number of different paralinguistic dimensions. The coding
procedure used was based upon Duncan’s earlier work. This procedure is
extremely detailed and, with trained coders, yields high inter-judge
reliability scores.

While the coding method will not be further described here, the results
of this preliminary study can be simply summarized. Based only upon
the coding of the instruction-reading tapes, Duncan was able to
demonstrate that a large amount of the variance in the mean picture ratings
given by the subjects could be accounted for by reference to the
“Differential Emphasis Score” for each of the separate instruction readings
the respective subjects had received.

The Differential Emphasis score is a single index which reflects the
degree to which the experimenter, in his vocal reading, has emphasized
(through variations in volume, pitch, rhythm, etc.) either “success” or
“failure” and either the positive or negative ends of the rating scale.
The correlation between differential vocal emphasis and the subjects’
subsequent picture ratings was +.72 (p < .01), and all subjects who
had heard greater emphasis on the rating alternatives associated with
[success] subsequently rated the photos as being of more successful people
than the subjects who heard readings that placed greater emphasis on
the failure alternatives (p < .001).
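
For readers who wish to see what a coefficient such as +.72 summarizes, the following Python sketch computes a Pearson product-moment correlation between one emphasis score and one mean rating per reading. It is an illustration only: the text does not say which correlation coefficient was used, so the choice of Pearson's r is our assumption, the function name is ours, and the numbers are hypothetical, not the Duncan and Rosenthal data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical illustration values, NOT data from the study:
# one Differential Emphasis score and one mean picture rating per reading.
emphasis_scores = [-2.0, -1.0, 0.0, 1.0, 2.0]
mean_ratings = [-1.5, -0.2, -0.6, 0.4, 0.9]

r = pearson_r(emphasis_scores, mean_ratings)
print(round(r, 2))
```

A positive value indicates that readings weighted toward the "success" alternatives went with higher subsequent ratings, which is the direction reported above.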

An additional finding of considerable interest [was] that the [correlation
between] experimenters’ assigned expectancies and the Differential
Emphasis scores was only .24. This suggests that though the pattern of
emphasis used by the experimenters is influenced by the assigned
expectancy it often varies from that expectancy either in the direction of
giving greater or lesser than average emphasis to it. It suggests further
that even where the relation between assigned expectancy and the
subjects’ picture ratings is low, the experimenter may actually be influencing
the subject (through his deviant pattern of vocal emphasis) a good
deal more than has previously been suspected.
From this preliminary study it seemed clear that with paralinguistic
analysis considerable further progress could be achieved in pursuit of
the difficult question of just how experimenter expectancy effects are
mediated. Since the Duncan-Rosenthal study had used a variant of the
method of postdiction it seemed especially desirable to attempt a more
ambitious and more fully controlled study. We would reverse the
procedure, moving from postdiction to prediction; this would be accomplished
by exposing subjects to vocal readings selected for their paralinguistic
direction (i.e., “success” or “failure”) and the degree of differential
paralinguistic emphasis. Thus by experimental manipulation we could gain
[...] of the hypothesis that in the typical [...] studies that may be
inadvertently [contaminated by ... ] effects) the subject’s responses are
influenced through paralinguistic aspects of the experimenter’s
communication to him.

[...] planned to extend our earlier inquiry into the way in [which the
experimenter expectancy] effect is mediated by the [...] apprehension.
Thus the next study (in which I was joined by [...]) was an experimental
[...] separate and interacting influences upon subjects’ [...]
paralinguistic emphasis and evaluation apprehension.

[...] of colleagues and graduate students taped readings of [the]
instructions for the Rosenthal picture rating task. Our [...] be “slightly
shaded” [in] either a positive (i.e., “success”-stressing) or negative
(i.e., “failure”-stressing) [direction]. [...] “experimenters” heard the
reading of [...] “balancing” and “shading” in strictly [...] transcribed
and scored for paralinguistic Differential Emphasis. [...] of nine readings
(three from each of three readers) [...] study. The three instruction
readings taken from [one of] the experimenters were scored as balanced
(i.e., no differential emphasis), moderate positive (i.e., intermediate
bias toward an emphasis on perceiving the pictured persons as
successful), and strong positive, respectively. From a second reader we had
balanced, moderate negative, and strong negative readings, and from a
third we had balanced, moderate positive, and moderate negative
readings.


In the basic design of this study each of the nine instruction tapes
was combined with each of three evaluation apprehension conditions,
thus yielding a 27-cell design. The evaluation apprehension conditions
employed were High, Control, and Low. As in our earlier studies the
evaluation apprehension manipulations were effected through a
“Background Information Sheet” read by the subject. The evaluation
apprehension bolstering and suppressing communications were similar to those
used in earlier studies. The control evaluation apprehension group
received no Background Information Sheet and was given no advance
explanation of the experiment.

The subjects were 216 female undergraduates (eight per cell) who
had volunteered in response to telephone calls requesting their
participation in a study of person perception. No payment or other rewards were
offered. All experimental sessions were run in the University of Chicago
language laboratory. In this facility the separate listening booths with
multichannel receivers could be easily adapted to a basic requirement
of our design: namely, that within each administration group (N varied
from 8 to 12 for the successive groups) each of the three thirds would
respectively hear one of the three different readings of the instructions
recorded by a single experimenter.
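
The arithmetic of the design (nine tapes by three evaluation apprehension conditions, eight subjects per cell) can be checked with a few lines of Python. This is a sketch for the reader's convenience only: the reader labels `reader_1` to `reader_3` are placeholders of our own, and the assignment of reading types to readers follows the description of the nine tapes given in the text.

```python
from itertools import product

# Nine instruction tapes: three readings from each of three readers.
# The reader labels are placeholders; the reading types follow the text.
tapes = [
    ("reader_1", "balanced"),
    ("reader_1", "moderate positive"),
    ("reader_1", "strong positive"),
    ("reader_2", "balanced"),
    ("reader_2", "moderate negative"),
    ("reader_2", "strong negative"),
    ("reader_3", "balanced"),
    ("reader_3", "moderate positive"),
    ("reader_3", "moderate negative"),
]
ea_conditions = ["high", "control", "low"]
subjects_per_cell = 8

# Every tape is crossed with every evaluation apprehension condition.
cells = list(product(tapes, ea_conditions))
total_subjects = len(cells) * subjects_per_cell

print(len(cells), total_subjects)
```

The cross-product yields the 27 cells and 216 subjects stated above.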

At the beginning of the experimental session, after [all] subjects [were]
seated in their randomly assigned booths, they first heard a taped
message thanking them for coming and, for the high and low evaluation
apprehension groups, directing them to read the Background Information
Sheet which was in a packet in front of the subject. After a [pause]
for this purpose (control subjects were run in separate groups
and had no such pause) each subject heard one of the taped [readings]
of the Rosenthal instructions. Immediately following this the
photographs to be rated for degree of “success” or “failure” were projected
onto a screen in front of the booths, each for ten seconds. The subjects
recorded their own ratings on a standardized rating sheet [which they]
were also required to sign. After the rating sheets had been [collected]
a postexperimental questionnaire was distributed and [upon its]
completion and collection all subjects went through a thorough debriefing
and were pledged to keep the purpose and design of the study
confidential.

Before data analysis was undertaken 35 subjects were eliminated on
the basis of important manipulation validation items from the
postexperimental questionnaire. Thirteen were eliminated because they
indicated that they had been aware of the purpose of the experiment, and
22 were eliminated either because they were in the low evaluation
apprehension conditions and rated the Background Information Sheet as
“anxiety arousing” or in the high evaluation apprehension condition and
rated the Background Information Sheet “reassuring.”
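
The exclusion rule just described can be stated compactly. The Python sketch below is our own restatement, not the authors' procedure: the function name, argument names, and the encoding of questionnaire answers as strings are all assumptions made for illustration.

```python
def should_exclude(ea_condition, aware_of_purpose, sheet_rating):
    """Return True if a subject fails the manipulation checks described in the text.

    ea_condition:     "high", "low", or "control"
    aware_of_purpose: True if the subject reported knowing the study's purpose
    sheet_rating:     the subject's rating of the Background Information Sheet
                      (e.g. "anxiety arousing" or "reassuring"), or None for
                      control subjects, who received no sheet
    """
    if aware_of_purpose:
        return True
    # The sheet had the opposite of its intended effect on the subject.
    if ea_condition == "low" and sheet_rating == "anxiety arousing":
        return True
    if ea_condition == "high" and sheet_rating == "reassuring":
        return True
    return False


# Counts taken from the text: 216 subjects run, 13 + 22 eliminated.
n_run = 216
n_excluded = 13 + 22
n_analyzed = n_run - n_excluded

print(n_excluded, n_analyzed)
```

By this count 35 subjects were dropped and 181 remained for analysis.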

Analysis of the data was based on the mean picture rating for each
subject. Because comparisons between experimenters were not made
in transcribing or scoring the readings of the instructions, and thus were
not reflected in the Differential Emphasis scores, it was necessary to
adjust the subject means to take into account any differences among
the experimenters. For each experimenter, therefore, the mean of all
subjects in his control condition (i.e., the mean of the picture rating
means from the subjects who heard his “balanced” reading of the
instructions [and had] not received evaluation apprehension manipulation) was
[subtracted from the] separate means of all his other subjects (i.e., those
[...]).

[...] indicated no significant difference in bias induction [...] positive
readings [...] or between the subjects who heard the moderate [and
strong negative readings]. [...] differences were visible between those
who heard [positive and] negative [readings] respectively, while those
who heard “balanced” [readings] occupied an intermediate position. It
was apparent then [that subtle variations in] volume, pitch, and rhythm
were just as [effective as] more pronounced ones in conveying a
differential [emphasis that] influenced response patterns. In our further
analysis [we therefore combined the data from subjects] who had heard
the moderate and strong positive readings of the instructions and,
separately, from those who had heard the moderate and strong
negative readings.
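
The adjustment for differences among readers can be sketched as follows. The scan of this passage is damaged at the point where the operation is named, so the sketch assumes the adjustment was a simple subtraction of each reader's baseline mean (balanced reading, no evaluation apprehension manipulation) from the means of his other subjects. The function name is ours and the numbers are hypothetical.

```python
from statistics import mean


def adjust_for_reader(subject_means, baseline_means):
    """Express each subject mean as a deviation from one reader's baseline.

    subject_means:  mean picture ratings of subjects who heard this reader
    baseline_means: mean picture ratings of the subjects in the reader's
                    baseline cell (balanced reading, no EA manipulation)
    """
    baseline = mean(baseline_means)
    # Assumed adjustment: subtract the reader's baseline from every score.
    return [m - baseline for m in subject_means]


# Hypothetical numbers for a single reader, NOT data from the study.
baseline_cell = [-0.4, 0.0, -0.2]
other_subjects = [0.6, -1.1, 0.1]

adjusted = adjust_for_reader(other_subjects, baseline_cell)
print([round(a, 2) for a in adjusted])
```

Scores adjusted in this way can be pooled across readers, since each is expressed relative to what that reader's neutral delivery elicited.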

When we tested the difference [...] between [all] subjects who
had received the positive differential [emphasis and those who had]
received the negative differential emphasis, the predicted main effect was
confirmed (p < .02).

In a further and more detailed analysis the [means of the]
six separate cells were arranged in the [order that would be]
predicted from the assumption that the [influence] of differential emphasis
would be facilitated to the degree that evaluation apprehension was
experienced. That predicted order was: high EA, negative differential
emphasis; control EA, negative emphasis; low EA, negative emphasis; 
low EA, positive emphasis; control EA, positive emphasis; high EA, 
positive emphasis. An analysis of variance was executed to determine 
whether the predicted order did, in fact, obtain. The resultant linear 
trend was found to be significant (p < .02). 
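
One conventional way to test whether cell means follow a predicted order is a linear-trend contrast within a one-way analysis of variance. The Python sketch below illustrates that general technique. The chapter does not report the computational details of its own trend analysis, so the equally spaced contrast weights, the pooled within-cell error term, and the function name are all our assumptions, and the example data are hypothetical.

```python
from statistics import mean


def linear_trend(groups):
    """Linear-trend contrast over groups listed in their predicted order.

    groups: list of lists of scores, ordered from lowest to highest
            predicted mean.
    Returns (contrast_value, f_ratio); F has (1, N - k) degrees of freedom.
    """
    k = len(groups)
    # Equally spaced weights centered on zero: -5, -3, -1, 1, 3, 5 for k = 6.
    weights = [2 * i - (k - 1) for i in range(k)]
    means = [mean(g) for g in groups]
    contrast = sum(w * m for w, m in zip(weights, means))
    # Sum of squares for the contrast, allowing unequal cell sizes.
    ss_contrast = contrast ** 2 / sum(w ** 2 / len(g) for w, g in zip(weights, groups))
    # Pooled within-cell mean square serves as the error term.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_within = sum(len(g) for g in groups) - k
    f_ratio = ss_contrast / (ss_within / df_within)
    return contrast, f_ratio


# Hypothetical scores for six cells in the predicted order given in the text
# (high EA negative ... high EA positive); NOT data from the study.
cells_in_predicted_order = [
    [-3, -1], [-2, 0], [-1, 1], [0, 2], [1, 3], [2, 4],
]
contrast, f_ratio = linear_trend(cells_in_predicted_order)
print(contrast, f_ratio)
```

A large F for the contrast indicates that the cell means rise steadily across the predicted order; the resulting probability would be read from an F table with 1 and N - k degrees of freedom.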

Figure 2 reveals the basis for this summary statistic and reports further
probabilities obtained through application of the Mann-Whitney Rank
Sum Test. Thus subjects who had first read the Background Information
Sheet designed to remove or reduce evaluation apprehension were
apparently uninfluenced by the differential emphases conveyed in the various
instruction tapes that they heard; no significant difference is found
between the scores of the low evaluation apprehension groups that
respectively heard positively and negatively biased readings of the
instructions. On the other hand, when subjects have read a communication
designed to confirm and heighten evaluation apprehension their picture
ratings are very strongly influenced by the paralinguistic shading of
the instruction tapes in either positive or negative directions (p < .006).
Similarly we find that when evaluation apprehension is not manipulated
(i.e., when, in the evaluation apprehension control condition, it
is allowed to operate at a level that we may assume to be set by the
interaction between the experimental task and the subject’s personality)
we obtain a smaller, but still significant, difference between subjects
exposed to positively and negatively shaded readings of the instructions.
The scope of this difference (p < .02) is roughly the same as that
reported in typical successful experiments by the Rosenthal group.

FIGURE 2

[...] Function of Evaluation Apprehension Level
Further and more detailed analysis of these data remains to be carried
out, particularly an analysis program that will draw upon material
gathered through an open-ended postexperimental questionnaire. But
[...] reported above we feel that we can [...] clearly discern the nature
and dynamics of the experimenter [expectancy effect]. [The] process
appears to be one in which subtle paralinguistic [shadings] in the
experimenter’s communications do convey his expectancy [as] regards
the response choices that the subject might make. [...] subject will be
attuned to these paralinguistic cues [...] them to influence his
experimental [...] that one of these considerations, and probably an
[...], [is whether] the subject has [...] to perceive [that the]
experimenter is likely to form judgments [about] the subject’s
psychological adequacy or attractiveness.

[...] possibility that still other modes of mediation, such [as ...],
[...] conveying the experimenter’s expectancy to the subject. What is
[...] additional channel of indirect communication was open [...] the
only experienced differences between the three separate [readings]
contributed by any single “experimenter” lay in their differential
paralinguistic emphases. Also relevant in this connection is this further
fact: [the magnitude of the] experimenter expectancy effect under both
the [...] evaluation apprehension conditions was as great as, or greater
than, [that] obtained in most experiments concerned with this type of
bias. This strongly [suggests that] paralinguistic emphasis is the main,
if not the only, [channel through which] the experimenter’s expectancy
is transmitted to the subject.


VI. SOME RECOMMENDATIONS AND OPEN ISSUES


In bringing to conclusion a chapter that has already probably taxed
the reader by its length and detail I shall resist the temptation to
elaborate further upon my basic argument and its supporting evidence. By
way of general summary it shall suffice to say that I have tried to
explicate a conceptualization of the evaluation apprehension process and
that all of the present studies appear to show that this process does
induce systematic bias in experimental responding. Some of the research
studies that have been reviewed have served an additional purpose:
they have delineated and examined certain variables that appear to
facilitate or restrict the operation of the evaluation apprehension biasing
process. On the basis of the present studies I think it reasonable to
put the seal of provisional validation (and the judgment that they are
worthy of further experimental study) upon the following propositions.


The biasing influence of evaluation apprehension upon response data 
Will be reduced if those data are collected by an experimenter other 
than the one whose evaluative judgment was the onginal focus of the 
subject’s concern 


When a response pattern cued as likely to bring positive evaluation
is also counternormative, subjects high on the need for approval will
be more likely to produce that response pattern than subjects low on
the need for approval.

The availability of continuous feedback about the quality of a subject's
performance will facilitate his shaping that performance in the direction
he thinks likely to earn him a favorable evaluation from the
experimenter.

The less effortful the response direction that has been cued as likely
to bring positive evaluation, the more will the subject go in that
direction.


When the experimenter is perceived by the subject as having power
over him (in the sense of controlling his access to some goal region
or activity) this will foster the biasing of his responses in cued
directions, and this will be particularly likely in the absence of other
conditions that directly arouse evaluation apprehension.


When the subject expects that a particular type of judgmental response
will earn him positive evaluation from the experimenter, and that
type of response is also counternormative or unpracticed, its production
will be facilitated by clarity in the stimuli to be judged.
Still other propositions supported or suggested by the research
reviewed in this chapter could be summarized. But the past work is
prologue to present and future concerns. Thus, my main purpose in this
concluding section will be to address some interesting implications and
open issues that seem to be suggested by the studies that have been
reported here.

The first of these is the question of whether one can draw from the
present research and analysis any clear prescription concerning the
conduct of psychological and related forms of research. A number of fairly
obvious recommendations do come easily to mind. One of these I have
already suggested: the altered replication approach does seem to afford
a way of testing reinterpretations of experiments whenever it is suspected
that the original data were influenced by inadvertent arousal of
evaluation apprehension. This strategy can and should be more widely
employed. Disputatious reinterpretation of the other man's research is easier
legitimates the former and assesses 
possible, these activities should be joined 
scnbef.n from our studies, de- 

prehension m which demonstrated that evaluation ap 

dotZto^hn; expectancy effect These xtud.es 

efccbLhfrJ apprehension and ite data distorting 

montal -f one defines the expen! 

details of such meli *“'’Joot Whatever the particular 

to perceive at Lst™Uvrihmex”.'hI!‘^l‘'In'“’ 

ment that his c ^ experimenter and his experi- 

uniquenl^ t i ^o much upon individuals m fheir 
thetic aspect that persons m their nonnative or nomo- 

"dry”) "tr.’ 'r'’ 

Credible messages to thj effect ® ^ ’?>' *0 experimenter 
same order, can probablv lip r-.ll. ’ 1 ooo'dental revelations of the 
pretested Undoubtedly^content 

types of subjects and'^y^ef rf e " rT'* ^e vaned with 

in handling this nrobleni ’Tenmental situations, but if interest 

-a technololy of TaWPon ^ ““n develop 

contnbutofyirtlrtnn.^^^^^^ oontrol that would. I thmk" 

psjchological research ^ S e quality and trustworthiness of 

At the same time it woud probablv be 

the experiment seem so empty of purpose or against making 

motiv-ation to remain ps>chologicalIy involved destroy the subjects 

Clearly, some art (and some validation of its products) will be required
in the development of techniques for limiting and reducing evaluation
apprehension.
However, such a change in standard procedure would raise an important
problem concerning an aspect of experimental method that has,
in recent years, become quite ritualized. Should the postexperimental
“debriefing” include an explanation of the evaluation apprehension
problem and of the way it was brought under control? Recently, some
commentators have argued that debriefing should not be conducted unless
it is required to reduce anxieties or ego injuries directly due to the
experiment, and unless it is also clear that the debriefing itself will
not embarrass the subject or diminish his self-esteem by demonstrating
his gullibility.

Against these considerations I would give great weight to the notion
that experimenters have an ethical obligation to be as frank as possible
with their subjects, even though full, disingenuous revelation must be
deferred until all data are collected. Nor do I think that such revelation
need be degrading. Whether the subject comes out of the debriefing
feeling tricked, and exposed as an “easy mark,” or whether he comes
out with a sense of having participated in a useful endeavor in which
he played an important part and was honorably treated would, I think,
depend largely upon the secret motives and visible style of the
experimenter. Surely, as Kenneth Ring (1968) has suggested, the ‘fun and
games’ approach to experimental social psychology degrades subjects,
trivializes research and, I would add, quite probably activates the
evaluation apprehension dynamic so as to induce unsuspected but sizeable
systematic bias in resultant data.

Candid and thorough debriefing, unmarred by any proclivity towards
gloating, can do much for the experimenter's self-image and probably
it also serves the enrichment of the subject's experience and knowledge.
However, it does generate a further problem, as much for experimenters
who may employ evaluation apprehension control procedures as for
experimenters pursuing other approaches. I refer, of course, to the risk
that, despite the elicitation from the subject of a pledge to say nothing
about the experiment to other potential subjects, the vessel of secrecy
may spring leaks. This, in turn, may spoil the host culture of the naive
subject pool without the experimenter knowing that anything of the
sort has happened. It is my impression that the pledge to
postexperimental secrecy is usually internalized when a bond of mutual trust has
been woven, and I am not aware of anything that works better to insure
that bond than full and candid postexperimental debriefing.

Furthermore, the postexperimental discussion that can be opened up
by mutual debriefing tends to free the subject to reveal much of his
own recent, subjective experience in the experimental situation. The
information thereby gleaned can be of considerable help in determining
whether evaluation apprehension or other contaminating processes may
have been operating during the experimental transaction. Such discussion
also provides the experimenter with a fairly comfortable occasion for
asking whether “you had heard anything about this experiment from
a previous subject,” just as it provides a facilitating context in which
subjects are likely to respond to that query with candor.
I am aware, of course, that I am dealing here in lore and impressions.
Clearly, more systematic research is required on the effects of the
debriefing strategy upon the subject's self-esteem, upon his maintenance
of the secrecy pledge and, for that matter, upon the value of his
introspections about his experiences in the experiment. But until such a body
to ™dertaken and reported I do think it reasonable 

of all f ^ general standard favoring postexpenmental revelation 
include d,F experimental purposes, and this should 

e Suer h r J WrehenLn problem and of the 

Sion Amons vanom mojor matter that requires some discus- 

that any thouohifnl r heretofore unattended is a question 

worked through these ‘^“"ceived as he has 
approval, and his mdcrmA^i. to win the experimenter’s 

only motive of interpewonal ref ^ psychologically adequate, the 
theespenmentaltraSaction? activated in the subject dunng 

range of conjecturaTsVn ™ natt" f reduce our 

affect the subject's way of ? ™°*‘''ational arousal that directly 

he responds in the latter’s p ^ experimenter (and, thus, how 
possibilities come easily to a number of other 

common than the process urmn ,.i. are probably far less 

do require discussion At least on chapter has focussed they 

cesses is quite familiar to .all nsvph additional data biasing pro- 

expenence and scnsilivity .j., P /™“'°g'aal expenmenters of sufficient 
time (and many subiects wKp ^ some subjects who most of the 
to confound the expcnmenter to'It'^ ®f the time) are hkely to want 
expectations, to violate what' thp what they perceive as his 

h)'pothesis ^ construe as his apparent scientific 

I am mindful that this observat 

rated bj Martin Ome (1962) the view elabo- 

is a “demand characteristic” to which”* k ^ '^xpenmenter’s hypothesis 
their role, are prone to vield This nature of 

argue later in tins section, when it doo "^PPcn though, as I shall 
by a general role-bascd standard of coorJJ! probably mediated less 
tion apprehension d)’namic P^mtiveness than by the evalua- 
Under what sorts of circumstances, or with what kinds of persons,
does the opposite tend to occur, i.e., what accounts for the not uncommon
instance in which the subject's purpose seems to be to ‘screw up the
works’?

With an ease and haste that may bespeak defensiveness, psychologists
are often prone to interpret such behavior as due to general hostility,
to character-based “anality,” or to lingering reverberations of the oedipal
revolt against authority figures. Such may indeed be the case with
occasional subjects. But another dynamic process seems to me to be far
more common. Evaluation apprehension, when strongly experienced,
may sometimes generate a sort of reactive anger toward the
experimenter, or it may be so intolerable as to require immediate “distancing”
from the experimental situation and its evaluative implications. Either
of these purposes, and yet other comparably defensive ones, can be
served by turning the tables on the experimenter and giving indirect
expression to a negative evaluation of him. Given the constraints of
the usual experimental situation, the most effective way of doing this
may often be to disrupt the experimenter's enterprise by emitting just
those responses which will, as the subject sees it, confound or disappoint
him. Also, if this can be done with a “light” style, with some visibly
amused irresponsibility, a further defensive stratagem is brought into
operation. The subject may then be able to believe that he has destroyed
the evaluative significance of the experimental transaction, for, if he
is clearly not taking the situation seriously, his behavior cannot be
meaningfully interpreted as saying much about his true psychological nature
or competence.

From the viewpoint of the experimenter, the problem posed by this
sort of process is not so much that it may occur as that it may not
be easily or reliably discerned. While skillful postexperimental inquiry
may be of some use in reducing this problem, there is, I think, another
important alternative. There may well be some personality patterns and
some foci of regnant conflict that tend to heighten the likelihood that
subjects will take recourse to the “confound the experimenter” strategy.
The question begs for early investigation, and psychologists interested
in the social psychology of the experiment will need to turn their
investigative skills in this direction.

Equally compelling, and probably even more readily open to
systematic investigation, is the question of what attributes of the
experimenter and of his instructions and preliminary explanations work toward
the same effect. It is my untested impression that experimenters who
are perceived by subjects as rather severe and unrevealing while, at
the same time, intrusively “nosy,” are the ones most likely to arouse
special data-biasing patterns of resistance in some of their subjects.
And obviously, it could be hypothesized also that the same is true of
experiments that are perceived (or misperceived) as probing too deeply
into anxiety-laden or low self-esteem areas of the private self.
At least one other rather obvious bias-inducing pattern requires
discussion in the present context even though, as far as I know, it has not
been submitted to any systematic study whatever. I have in mind the
occasional sounding of the “cry for help” by a genuinely troubled or
unhappy subject who thinks he ought to be, but presently is not, a
patient.


Undoubtedly, this is far less common than the aspiration to appear
normal and win a positive evaluation, but just how uncommon it is
I do not know. From my own experience and that of colleagues with
whom I have discussed the matter I would hazard this judgment: with
some small number of undergraduate subjects (and, perhaps, most often
with freshmen at times of situational stress) contact with a “psychologist”
does activate the regressive longing for some show of support and
sympathy from a wise, compassionate parent surrogate.

in^Tau' The most important, 

there "'«*<idologicaI concern, is that out of this background 

with which th ° responding opposite to that 

but fust as trouble 

rntal s Ltion mvolvement in the experi- 

Tubtct Sme "> the -cry for help” the 

subjects and either wuh"'* '^"t““tional cues as are available to other 

his expenmental responding To artVSeTr^fu T '’’T 

to expenmental test *0 hypotheses that are being put 

scrubny through further rS'frch A Prohlem to systematic 

such as the picture rating task used expenmental situation 

employed, and to it there” can be eii u j'"?’’ “an be 

"normal’ and abnormal response natT'''^'^ 0“^'^ “*“ar implications of 
“abnormal direction could be^kenL deflection in the 

negative or ’needful self representata 
could be examined againstfioordmate vanal''“‘“‘™' ” 
m systematic manipulations of the experiment'™ P®™ "'‘h‘y '"^ices 
scnpt, and pnor inductions of psychSogical stresf O t 
research program there would prZbly emerge a ? 1 
stnctures that would help to further redul®^ m l “a“‘'“°a-y 
bias in psychological and Ldred tvne- * . P™’’'®” systematic 
My purpose in these last few pages has been to note that, in addition
to the subject's striving toward positive self-representation as a way
of reducing evaluation apprehension, there are some other, related trends
which may also induce systematic bias in response data. Returning to
our main focus upon the former process, I should now like to address
an issue that has haunted the discussion at a number of points but
has not yet been fully confronted. What is the relationship between
the evaluation apprehension dynamic and such other sources of
systematic bias as the experimenter expectancy and demand characteristic
processes?

The answer that I think most acceptable, though only in a provisional
way, is already implicit in my earlier discussion of the two experiments
in which we found that the activation of evaluation apprehension
facilitated, and its reduction obliterated, Rosenthal's experimenter expectancy
effect.

In their separate research and theorizing Rosenthal and, to a lesser
degree, Orne have both emphasized the experimenter side of the
experimenter-subject interaction; that is, they have delineated and
demonstrated that experimenters do indirectly reveal what sorts of responses
they would welcome from their subjects and they have also shown that
this does, somehow, affect the responses of those subjects. However,
they have had far less to say about the subject's side of the transaction,
about the patterns of concern, apprehension, and ego defensiveness
which move him toward acting out, or at least coordinating to, the
experimenter's implicit demands.

It is, of course, true that Rosenthal has addressed this issue in some
of his fascinating side excursions (Rosenthal, 1966) into the personality
attributes of comparatively biasable and unbiasable subjects. But what
has been required as well is a narrower or more process-oriented focus
upon the actual psychological events that carry the subject through
the experiment and up to the point at which he “delivers” the elicited
gift of his responses.

While it has been insufficiently developed, this sort of concern has not been
totally ignored during the short period in which the social psychology of the
experiment has commanded intellectual interest. Some usefully provocative beginnings
in this direction were elaborated by Riecken in his seminal article (1962), and
Orne, despite the experimenter-oriented nature of the demand characteristic concept,
has also been somewhat sensitive to these matters.

However, while the focus upon subject processes has not been totally absent
in earlier speculative writing, it has lagged in development. Perhaps this is due
to its having been obscured by the deserved figural prominence of the work on
experimenter expectancy and demand phenomena. The proper corrective lies not
in abandoning the latter interest but in restoring and expanding our concern with
the former.
However, whether this or some other private purpose animates the
typical subject is of less importance for the moment than the altered
perspective that is opened to us when we lay basic stress upon the
subject as seeker. From this emphasis there follows the necessary
recognition that even when there is no direct cueing conveyed through the
experimenter's behavior, the subject may be prone to construct some
personal interpretation of the “true meaning” of the experiment. More
often than not, he will speculatively examine the instructions he has
received, the overall rationale that has been provided, the procedures
and measuring devices to which he has been exposed; and out of the
questions these raise for him and the hints they convey to him he will,
if at all possible, draw some meaning, some guiding hypothesis about
what is really being investigated and how he can best display himself
to the investigator.

In this view, then, the experimental situation and, for that matter,
nonexperimental research situations as well, can activate the subject
to search for their meaning. Whether the meaning found is often focused
upon the evaluation theme, as I have argued, or upon yet other themes,
there ensues a consequence as intellectually fascinating as it is
methodologically troublesome. The subject's final “definition of the situation”
will affect his responding and thus will be reflected in the dependent
variable data.

To turn again to the problem of improving research procedures, the
foregoing argument clearly suggests a further caution. The danger of
inadvertent systematic bias in response data cannot be fully reduced
by effective elimination of the experimenter expectancy and demand
characteristic problems. We must remain sensitive to the possibility that
the subject, no matter how acquiescent or calm he appears, may be
actively processing his impressions toward the development of some
interpretive hypothesis, one that will lead him to adopt a response
strategy that may distort the resulting data.

An analogue for this whole process is provided by the larger number
of our present studies, excepting those focused upon the experimenter
expectancy phenomenon. In the former group of studies the systematic
biasing of the subject's response patterns was not demonstrably due
to any intraexperimenter or interexperimenter variations in behavioral
style. Rather, the differences in subjects' performances could be directly
traced to the fact that the preparatory materials they read contained
hints that they could then rather easily shape into hypotheses about
the purpose, or the indirect revelatory significance, of the experiment.

In substantive research focused upon other psychological issues and
conducted by experimenters who do not intend their experimental proce-
The evaluation apprehension process as defined in this chapter and
as exemplified in our various studies appears now to be an important
part of the subject side of the total experimental transaction. In the
emerging general theory of the “social psychology of the experiment”
it does not replace the account of experimenter expectancy effects
developed by Rosenthal. Rather, it extends it and perhaps also deepens that
account by adding further clarity about the conditions under which
experimenter bias is likely to be induced. As regards the demand
characteristic process posited by Orne, the present approach does inevitably
raise some difficulties and disposes me toward one note of disagreement.
This concerns the motivational-perceptual pattern which facilitates the
subject's yielding to the experimenter's “scientific hypothesis.” Where
the experimenter's true hypothesis is clear to the subject (and I would
think that usually it is not) yielding to it would most likely be mediated
by the expectation that this will somehow bring approval or other
immediate social rewards from the experimenter. To be sure, Orne might
be interpreted as saying that positive self-evaluation is being sought
W parbcularly m that he may take pleasure m viewing 

^^ommodating and helpful person But the present studies, 
fl96S^ pertinent one by Segall, Aronson, and Van Hoose 

uyes), suggest that evaluabon * 


Tn#»nror . , apprehension focused upon the expen- 

Thus I wm M ^ pattern of subject sensibvity 

the exDenmll^r^'^V^u ^>^***^"*5 subject’s readiness to help 

instrumental I? point, if expenenced at all, is an 

menter ludpes reassuring evidence that the expen 

My -- auracbvely “normar peLn 

adds to the m^rp A ’ evaluabon apprehension 

to vanabihty m fce exoerimpnt be traced 

which mav he due tn c behavior it directs us toward those 

mem Itself ^ highlights and ambiguities of the expen- 

While they do not logically reauire ,i il. it, 

ones sometimes tend to view the f experimenter-onented the 
recipient of imphcit “messages’ „r ’“ul? 

would suggest that where such cues are ah i This 

temahe bias would be unlikely to occur T„ ^ imperceptible sys- 
theoiy of the expenmeutal trausact.o“™L * h"’ " 

thing from the experimental expeneucT T as seekmg some- 

that "something" i the expeniXters J theoretical view 

jeefs psychological adequacy and on flis Ws” th 

or enhancement of the subjert s self es^m maintenance 



CONDITIONS & CONSEQUENCES OF EVALUATION APPREHENSION

within a total experimental system, a component that, by processing
inputs into outputs, somehow automatically reveals immutable
psychological laws.

Having said this much, I must hasten to add that I do believe that
such laws exist in nature, and that the experimental method has been
and will remain essential to the task of apprehending and confirming
them.

Those psychologists who have responded to recent research on the
social psychology of the experiment with despair over the prospects
of the experimental method itself are, I think, guilty of unjustifiable
reactive depression and are casting out the baby with the bath. When
they call for renewed recourse to “field studies,” to “natural observation”
with “non-reactive measures,” and to phenomenological inquiry they
are doing the behavioral disciplines a useful service. Those ways of
gathering data (though equally open to systematic bias effects) can
do a great deal to enrich inquiry into the regularities that govern man's
psychological development and his functioning in relation to the persons
and institutions that define his existence.
However, when such critics suggest that the experimental God is dead,
they appear to have missed the point implicit in all research on the
social psychology of the experiment. That point is that the experimental
method can readily be used to perfect, or at least to significantly
improve, itself. Any experimental demonstration of some source of
systematic bias and of the process by which it operates immediately
suggests procedures for the control and elimination of that source of bias.
Another heartening consideration is, simply, that on the basis of present
knowledge a great deal is already known about how to reduce the
dangers of contamination and systematic bias. Such knowledge can also
inform the critical evaluation of the worth of particular experiments as
these are reported. The wheat, then, can even now often be separated
from the chaff, and the yield is not a grossly unfavorable one.
A truly exciting and optimistic prospect has been opened by a decade
of work on the social psychology of the experiment and I hope that
it has been further advanced by the present inquiry into the evaluation
apprehension process. We are approaching the point at which we may
achieve a practical (if not philosophically perfected) solution to the
basic epistemological problem of detaching the knower from the
known, of allowing the order inherent in behavioral and social processes
to tell us its own true story without any distortion due to promptings
of the listener or failings of his listening device.
The velocity of further advance toward the improvement of both
experimental and nonexperimental investigative procedures is likely to in-
dures to induce systematic bias, the suspicions aroused and the hints
conveyed by the instructions, manipulations, and measures may be of
more obscure origin and less certain import. Yet “seeking” subjects are
prone to pick up whatever cues may be available in the structural and
procedural detail of the experiment itself.
The more figural and prominent are the cues of this type, the more
likely that separate subjects will come to the same or similar interpretive
hypotheses about how to assure positive evaluation for themselves, or,
for that matter, about how to reach still other social goals that they
may be seeking. In consequence, it will be more likely that a systematic
bias in one or another response direction will result. In contrast, the
more obscure and the more numerous such provocations toward suspicion
"'“f comparatively 

foster ^random” 

™ increase in the possibility that 
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information seehino^ambinriT”^'' subjects engage in active 

pretive hypotheses^’ seduction and the development of inter- 

(■rwrth ru^moslTelt®^ ■" r** f“” “consciousness” 

think more likety, with mtersubiecTa d""* ’^'■ocmahve clarity) or, as I 
tion, effort, and attentivene^ * mtrasubject variability in motiva- 

our research interest m important is that we translate 

that will make possible their^’^^Ser'” 'J"osh“ns 

most useful focus of the reauirpi/1 f lu *”''®stigation In my view the 
ask just what vanables dei^rm “J^ber research effort would be to 

fomulating hypotheses, and whaTolherv‘^u'‘°'^n“''f'’''*" 

and certainty of those hynotheses .n o™*'*cs influence the content 
formed into actual, data-yieldinv which they are trans- 

Equally important, of course ■ r 

the hkehhood that such activi’ties conditions which reduce 

or elimination of evaluation aoorehen.„„ / ^ reduction 

penraent so that evaluation apprehens.n° '‘mcturing of an ex- 

one such important condition But th " “ises) appears to be 

discovery would be a great boon to probably others and their 

Until all these mattere have Ivon ^ whole experimental enterprise 
research it is necessary that expenmmT^ through further 

of the “average” subject as a passive and ‘T 

V nd patient human component 





Orne, M. On the social psychology of the psychological experiment: with particular
reference to demand characteristics and their implications. American Psychologist,
1962, 17, 776-783.

Riecken, H. W. A program for research on experiments in social psychology. In
N. F. Washburne (Ed.), Decisions, Values and Groups. Vol. 2. New York:
Pergamon Press, 1962.

Ring, K. Experimental social psychology: some sober questions about some frivolous
values. Journal of Experimental Social Psychology, 1967, 2, 113-123.

Rosenberg, M. J. Cognitive structure and attitudinal affect. Journal of Abnormal
and Social Psychology, 1956, 53, 367-372.

Rosenberg, M. J. An analysis of affective-cognitive consistency. In Rosenberg, M. J.,
Hovland, C. I., et al., Attitude Organization and Change. New Haven: Yale
University Press, 1960. (a)

Rosenberg, M. J. Cognitive reorganization in response to the hypnotic reversal of
attitudinal affect. Journal of Personality, 1960, 28, 39-63. (b)

Rosenberg, M. J. When dissonance fails: on eliminating evaluation apprehension
from attitude measurement. Journal of Personality and Social Psychology, 1965,
1, 18-42.

Rosenberg, M. J. Some limits of dissonance: toward a differentiated view of
counter-attitudinal performance. In S. Feldman (Ed.), Cognitive Consistency. New York:
Academic Press, 1966.

Rosenberg, M. J. Hedonism, inauthenticity, and other goads toward expansion of
a consistency theory. In R. P. Abelson, E. Aronson, W. J. McGuire, T. M.
Newcomb, M. J. Rosenberg, and P. H. Tannenbaum (Eds.), Theories of Cognitive
Consistency: A Sourcebook. Chicago: Rand McNally, 1968.

Rosenberg, M. J., Hovland, C. I., McGuire, W. J., Abelson, R. P., and Brehm,
J. W. Attitude Organization and Change. New Haven: Yale University Press,
1960.

Rosenthal, R. Experimenter Effects in Behavioral Research. New York:
Appleton-Century-Crofts, 1966.

Sigall, H., Aronson, E., and Van Hoose, T. The Cooperative Subject: Myth or
Reality? Dept. of Psychology, University of Texas, 1968 (mimeographed).

Silverman, I. Role-related behavior of subjects in laboratory studies of attitude
change. Journal of Personality and Social Psychology, 1968, 8, 343-348.

Silverman, I., and Regula, C. R. Evaluation apprehension, demand characteristics,
and the effects of distraction on persuasibility. Journal of Social Psychology,
1968, 75, 273-281.





MILTON J ROSENBERG 


crease as research on the social psychology of social inquiry is vigorously
prosecuted. And if, on occasion, one is troubled by the ostensible paradox
that the processes inducing systematic bias may operate in our very
investigations of systematic bias, there are at least two types of
reassurance available. The lesser one is that every investigation in this realm
profits the succeeding one; error should fall away as we continue to
“zero in” toward the goal of bias-free research. The greater reassurance
is that paradox itself is a goad toward intellectual and scientific
adventurousness; the more closed off and ostensibly circular the problem,
the more deserving it is of assault and solution.
I. LOGIC OF INFERENCE


If we had remained with the definitional operationism of our recent
past, we would not have known the problems with which this volume
deals. Our experimental setups and our measurement procedures would
have been treated as definitional of our theoretical concepts. Conceptualizing
them as definitions would have excluded recognizing them as
biased, as systematically imperfect as well as randomly errorful.
Definitional operationism did indeed lull some into an uncritical com-
placency and reification of test scores, but fortunately the major practi-
tioners of science had too little contact with or too little faith in philoso-
phy of science to be misled. While logical positivists were defining
intelligence in terms of the Stanford-Binet, 1916 edition, Terman was al-
ready initiating revisions designed to make it a less biased and more
accurate measure of intelligence, a goal which clearly showed that for
him the test was not the definition. Similarly, every physicist working
with a measurement device such as the galvanometer knows that in


Supported in part by National Science Foundation Grant GS1309X.
Written while Fulbright Lecturer in Social Psychology at the University
I am indebted to my host, Michael Argyle, both for generous hospitality
and for help with this paper.
The invalidity comes from the existence of the cross-hatched area, i.e.,
other possible explanations for B, C, and D being observed. But the
syllogism is not useless. If observations inconsistent with B, C, and D
are found, these validly reject the truth of Newton's theory A. The
argument is thus highly relevant to a winnowing process, in which pre-
dictions and observations serve to weed out the most inadequate the-
ories. Furthermore, if the predictions are confirmed, the theory remains
one of the possible true explanations. This asymmetry between logically
valid rejection and logically inconclusive confirmation is the main thrust
of Popper's emphasis on falsifiability.
The truism is now safely ensconced in elementary presentations of
inductive logic, without the necessity of citations to Popper (e.g.,
Hempel, 1966; Salmon, 1963). There is another decision locus in the
process upon which Popper's critics have focused: do the observations
in fact confirm the predictions? It was assumed in the above that this
decision could be and had been made. At this level, falsifiability and
confirmability are more logically symmetrical. And at this level, observa-
tions always falsify a quantified prediction if carried out with sufficient
precision. At this level the tolerance for accuracy which scientists actu-
ally allow is a social system function, determined by the degree of devel-
opment of the science, the degree of experimental control achieved,
and the sharpness of competition from other theories. Thus for Einstein's
prediction of the bending of starlight passing the sun, as in the 1919
eclipse, a predicted value of 1.745 seconds of arc has been “confirmed”
by values of 1.61", 1.98", 1.72", 2.2", and 2.0".
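The asymmetry just described can be written out schematically. The following is only an illustrative restatement in standard propositional notation (the symbols are those of the text's example: A for the theory, and B, C, D for its predictions); it assumes the amsmath package for \text.

```latex
% Confirmation: affirming the consequent, logically invalid.
\[
\frac{A \rightarrow (B \land C \land D) \qquad B \land C \land D}{A}
\quad\text{(invalid)}
\]
% Rejection: modus tollens, logically valid.
\[
\frac{A \rightarrow (B \land C \land D) \qquad \lnot (B \land C \land D)}{\lnot A}
\quad\text{(valid)}
\]
```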
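The dependence of "confirmation" on an allowed tolerance can be made concrete with a small sketch. The predicted and observed values below are the ones quoted in the text; the function names and the particular tolerance levels are assumptions introduced purely for illustration, not anything the original analysis specifies.

```python
# Predicted deflection and reported values, in seconds of arc, as quoted in the text.
PREDICTED = 1.745
OBSERVED = [1.61, 1.98, 1.72, 2.2, 2.0]


def relative_error(observed, predicted=PREDICTED):
    """Absolute deviation from the prediction, as a fraction of the prediction."""
    return abs(observed - predicted) / predicted


def confirmations(observations, tolerance, predicted=PREDICTED):
    """Return the observations that count as confirming at a given tolerance.

    `tolerance` is a hypothetical, socially chosen threshold on relative error.
    """
    return [o for o in observations if relative_error(o, predicted) <= tolerance]


if __name__ == "__main__":
    # Illustrative tolerance levels only.
    for tol in (0.01, 0.05, 0.30):
        kept = confirmations(OBSERVED, tol)
        print(f"tolerance {tol:.0%}: {len(kept)} of {len(OBSERVED)} values confirm")
```

With a sufficiently strict tolerance every one of the reported values falsifies the prediction; with a loose one they all confirm it, which is the sense in which the decision is made at the level of the scientific community rather than by logic.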

Let us look in more detail at the Euler circles and the relation of
confirmed predictions to the truth or credibility of a theory. It is our
inescapable predicament that we cannot prove the theory true.
Within the limitations there diagrammed, what we as scientists
do is to try in some practical way to “empty” the cross-hatched area,
to make it as small as possible. We do this by expanding as much as
possible the number, range, and precision of confirmed predictions.
The larger and more precise the set, the fewer possible alternative singu-





DONALD T CAMPBELL 


practice it fails of perfect reflection of electrical potential differences be-
cause of the effects of gravity, friction, inertia, field forces, etc. (e.g.,
Wilson, 1952). While compensated and corrected design may minimize
these sources of error, on theoretical grounds the galvanometer is known
to be subject to systematic biases, the elucidation of which is itself a
history of cumulative scientific achievement rather than of logical
revelation.

If definitional operationism and other accoutrements of logical posi-
tivism now are recognized as misleading, how are we to understand
our predicament as knowers, and in such a way as to make philosophical
sense out of the prototypic activities of this book? For me, the orientation
of Karl Popper (1959, 1963; Campbell, in press) and that partial com-
mon denominator shared with Polanyi (1958), Toulmin (1953, 1961),
Kuhn (1962), and Quine (1953), although they might be the last to
acknowledge any such, seems most appropriate. I shall try to present
unorthodox albeit through metaphors that are perhaps

Slucbr of 

pro”^; 11 ll general.zat.ops are not logioally 

arereteLdlni* P'’>>°sophers take this to be 

lone to continffe^i-^r"'** statement of the inappropriateness of analytic 

proven thev akn la«i. * ^ scientific truths logically un- 

scientific, or iniplicativTCThlv'’aT 

best of theories if not ‘ r "estabhshed ” The 

Logie rrelevlTto^^^^ "" '■=“' ““^berated ” 

the situation. The “scandal of induction” can be expressed by noting
that science makes use of an invalid logical argument, making the error
of the “undistributed middle,” or of “affirming the consequent.”
But while invalid, the argument is not useless.

The logical argument of science has this form:

If Newton's theory A is true, then it should be observed that the
tides have period B, the path of Mars shape C, and the trajec-
tory of a cannonball form D.
Observation confirms B, C, & D.
Therefore Newton's theory A is true.

We can see the fallacy of this argument by viewing it as an Euler
diagram.
enumerable in advance, or at all, and since these are usually quite par-
ticular and require quite unique modes of elimination, this is inevitably 
a rather unsatisfactory and inconclusive procedure. But the logical analy- 
sis of our predicament as scientific knowers, from Hume to Popper, 
convinces us that this is the best we can do, that this is our labor 
of Hercules, if not our task of Sisyphus. 


II. METHODOLOGY AND PLAUSIBLE RIVAL HYPOTHESES

Many of the potentially invalidating plausible rival hypotheses come
from other well developed theories. These are not our particular concern
here. Other plausible rival hypotheses are concomitants of the specific
apparatus of the experiment. These become the subject of “method-
ology.” The hypotheses involved are often very specific, unelaborated,
and unintegrated with wide-ranging theory. Nonetheless, they are hy-
pothesized empirical laws relating two or more variables. To the extent
that they become methodological essentials, they are well-established,
and hence plausible, empirical regularities. This, rather than logical re-
quiredness, is the source of their authority. This is not to deny that
they are also in some sense logical (Petrie, 1963). But there are so in-
finitely many logically required controls that it would be impossible to
incorporate them all. The cross-hatched area is infinite, logically. It is
only at a practical level that we approach emptying it.

This empirical status of methodological requirements seems to me
an important point too little noted. Too frequently we teach scientific
method as though it were a dispensation from logic, prior to and external
to science. In an early advocacy of experimental designs lacking pretests
(Campbell, 1957, 302), I wrote as though it were a matter of obvious
logic that if one had only experimented with pretested populations,
one had no basis for generalizing to unpretested ones. This now seems
to me quite wrong in emphasis. The authority came not from logic,
but from the empirical plausibility, the probable law-like character,
of pretest interactions. Following logic alone, it was equally illogical
to try to generalize to other groups of human beings, to other times, past
and future, or to other settings varying in any detail from the
original experiment. In this vast array of possibilities, only this
one was persuasive because of its empirical appeal. The hypotheses
involved, as reviewed by Lana in this volume, do not now seem at
all as plausible as they seemed then, and as a result, we
might see less emphasis upon experimental designs doing without the
pretest, or will see these justified on other grounds. Furthermore, where
lar explanations, even though this number still remains in some sense
infinite.

More important, we in fact pay little or no attention to the mere
logical possibility of alternate theories, to the merely logical existence
of a cross-hatched area. Toulmin has stated the point well:


“Again, philosophers sometimes assert that a finite set of empirical
observations can always be explained in terms of an infinite number
of hypotheses. The basis for this remark is the simple observation that
through any finite set of points an infinite number of mathematical curves
can be constructed. If there were no more to ‘explanation’ than curve-
fitting, this doctrine would have some bearing on scientific practice.
In fact, the scientist's problem is very different: in an intellectual situa-
tion which presents a variety of demands, his task is, typically, to ac-
commodate some new discovery to his inherited ideas, without needlessly
jeopardizing the intellectual gains of his predecessors. This kind of prob-
lem has an order of complexity quite different from that of simple curve-
fitting: far from his having an infinite number of possibilities to choose
between, it may be a stroke of genius for him to imagine even a single
one” (Toulmin, 1961, 113-115).


It is only when there exist actually developed alternative explanations,
contents to the cross hatched area, that 
firm theories whose predictions have been con- 

NmvL’! developed rivals that 

such cntical^P ^ certainly true for 200 years, even by 

rLv ^ 1 The cross hatched aL was empty 

of scientific *®S'cal correctness of Hume’s analysis 

induction bv the ^ relevant problem for scientific 

forthatofEmstem'“' ° ™’'"®'l"™K>verthrow of Newton’s theory 

IS one of a competibon betrsXn develim rt™* The truer picture 

roborated theones for an overall h" “’"• 

{Campbell, 1966) superiority m pattern matching 

Thus the only process available for establishing a scientific theory
is one of eliminating plausible rival hypotheses. Since these are never



PROSPECTIVE ARTIFACT AND CONTROL 




tific achievement, an empirical product, not a logical dispensation. They
represent generally verified hypothetical laws: in the philosophers'
terms, contingent, descriptive, synthetic, and therefore corrigible
“truths,” rather than logical or analytic truths. The ‘control group’ is
a feature we psychologists are taught as axiomatically required. It is
seldom noted that it, or its analogue, is totally missing from most of the
19th century physics, chemistry, and physiology from which we took
our methodological models. As Boring (1954) documents, it was invented
as recently as 1907 to control for a plausible rival hypothesis quite spe-
cific to psychology, namely that pretests would produce gains in per-
formance even in the absence of experimental treatments (that is what
we can call a main effect of testing in contrast to the interaction effect
of testing described above). Were it regularly to be found that there
were no practice effects, this reason for the control group would be
eliminated. This is not likely to be so for much of experimental psy-
chology, but might be for persuasion studies, as Lana has noted. Our
typical synchronous pretest-posttest-control group design also controls
for other hypothetical rival explanations of change (Campbell, 1957).
But for an experimental psychologist studying the learning of nonsense
syllables, none of these are plausible enough explanations for a gain
in memory, so that he typically does without. Even in the most primitive
one-group pretest-posttest design, the need for a pretest would not be
there if it were not for the empirical facts that test-retest correlations
are greater than zero and that individuals are not all equal.
Superimposed upon the simple control group are other required con-
trol groups whose empirical justification remains more obvious. Thus
in cortical ablation studies, the sham operation control group reflects
the empirical fact of surgical shock. With accumulating evidence about
the nature of surgical shock, it becomes an utterly implausible explana-
tion for many ablation and electrode-stimulation effects, and as a result
is dropping out of use. The placebo control group reflects the very
well established law as to the therapeutic effects of believing one has
received a curative treatment (see Orne, Chapter 5). Where pharma-
ceutical research is being tested in terms of the very general criterion of
illness-health, it still remains essential. For much more specific effects,
it can often be skipped. The double blind placebo control group reflects
the further empirical law of the effect of the faith of the experimenter
administering the pill, and gets us into the realm of facts which Rosenthal
has so well documented (1966, and Chapter 6).

A major way in which this volume contributes to the science of psycho-
logical method is thus in establishing the need for new control groups.
From Orne come “demand character” control groups. From Rosenthal
interaction effects are present, they are of a dampening nature, leading
to an underestimation of the other law under investigation. They are
not of the sensitization type that would produce pseudo effects utterly
untypical of the natural situation.

In this case, the “plausibility” of the rival hypothesis as stated in
1957 was based partly upon an appeal to common-sense knowledge
of psychological processes. However, my presentation did contain what
now appears to have been an erroneous pseudo citation, and one of
considerable persuasive power. It will add to the manifest uniformity
of Lana's review of empirical findings if I take space here to set the
record straight. What I reported was that in the “Cincinnati looks at
the United Nations” study, it was only the pretested panel that showed
any awareness of, or effects from, a very intensive public communications
effort. The study involved two parts. The main one compared two sepa-
rate but randomly equivalent samples of 1000, one taken before the
campaign, one after, finding essentially no differences. The results of
this had been published in the paper I cited. By oral report from one of
the authors of that paper I learned of the subsidiary study done for
Columbia's Bureau of Applied Social Research, which involved reinter-
viewing the pretest sample. By this oral report I learned, or thought I
learned, the outcome of that still unpublished study, and this is what

imVrt t'h ^e^^sd a most apt illustration, just what was needed to 

of revising 

I , ■" (Sent.., Jahoda, Deutsoh, and 

nreci' f ^ great deal of effort trying to track down a more 

S sh.dv"'"'*’ 1 . 'herefore presented the 

a Poss.brUty Eventually there turned up 

t It ^ q-te different find 

unnrelcsted significant differences between the pretested and 

‘^“ds m L direction 
LnTor T n (A later pre- 

d!d not Tvo rcT *0 facts straight but 

.1 usTes hoi ’■“o P^“™h=d ) This aLcdote 

” cd ll«; hvlnll, 7' ““ upirn die hypothe- 

v7e 1 'iTolVri ^<=g“'»‘ie\ and therefore, the rele- 

vance ot cmpincal fact to the establi« 5 limAr,i- 

control. One important implication of this argument is that it is not
failure-to-control in general that bothers us, but only those failures of
control which permit truly plausible rival hypotheses, laws with a degree
of scientific establishment comparable to or exceeding that of the law
our experiment is designed to test.

Thus our current standards of experimental design represent a scien-
through a modification of the experimental group. As a control, this
involves an a priori preference for parsimony, for it is always possible 
that two separate irrelevancies, a different one in each setting, explain 
the superficially consistent results. Again, we must disregard this possi- 
bility except insofar as specific versions are developed and plausible. 


B. Confounded Aspects of the Treatment: Interaction Effects
There may be a genuine effect of the theoretical variable that is spe-
cific to (or inhibited by) particular vehicular components. Again, the
potentialities are so numerous that we pay attention only to explicitly
elaborated and plausible hypotheses of this nature. More than that, we
are even more likely to disregard or judge implausible an interaction
effect than a main effect. Perhaps part of the reason for this is that main
effects are more easily handled. Probably more important is the very gen-
eral inferential generalization that main effects are more probable than
interaction effects. This would be analogous to, or perhaps even a
part of, Mill's inductive presupposition that nature is orderly. Such
a generalization might be descriptively true, and it would seem to me
worth an actuarial survey of analyses-of-variance in Ph.D. dissertations
(a less biased sample than published research). But even if this is not
descriptive of nature as we find her, it is descriptive of the knowable
aspects of nature, a biased sample upon which science and simpler
knowledge processes necessarily focus (Campbell, in press). By knowl-
edge we mean, in part, usable reidentifiable samenesses in settings that
are not identical. If the highest order interactions with the specifics
of space, time, and attributes are always significant, then no generaliza-
tion is possible, and hence no knowledge and no science. A successfully
established main effect is a much more general generalization than is
an interaction effect. Much of the basis for the recalibration of
measurement dimensions is the search for that option of quantification which
puts the most regularities into main effects.
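The distinction between a main effect and an interaction effect can be illustrated with the simplest possible case, a treatment tried in two settings. The sketch below is only an illustration: the function name, the 0/1 coding, and the cell means are all hypothetical and are not data from any study discussed here.

```python
def two_by_two_effects(cell_means):
    """Decompose a 2 x 2 table of cell means into a main effect and an interaction.

    `cell_means` maps (treatment, setting) -> mean outcome, with each factor
    coded 0 or 1 (0 = control / first setting, 1 = treated / second setting).
    """
    effect_in_setting_0 = cell_means[(1, 0)] - cell_means[(0, 0)]
    effect_in_setting_1 = cell_means[(1, 1)] - cell_means[(0, 1)]
    # Main effect: the treatment effect averaged over the two settings.
    main_effect = (effect_in_setting_0 + effect_in_setting_1) / 2
    # Interaction: how much the treatment effect differs between settings.
    interaction = effect_in_setting_1 - effect_in_setting_0
    return main_effect, interaction


if __name__ == "__main__":
    # Hypothetical case 1: the same treatment effect appears in both settings.
    general = {(0, 0): 10, (1, 0): 14, (0, 1): 20, (1, 1): 24}
    # Hypothetical case 2: the effect appears in one setting only.
    specific = {(0, 0): 10, (1, 0): 14, (0, 1): 20, (1, 1): 20}
    print(two_by_two_effects(general))
    print(two_by_two_effects(specific))
```

In the first hypothetical table the law generalizes across settings (zero interaction); in the second the apparent effect is specific to one setting, which is the kind of limitation on generality the paragraph above describes.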

Rival hypotheses in this class are not controlled by the expanded-
content control group approach, but would usually be by the altered
experimental treatment. Note that this latter as a control involves both
the a priori preference for parsimony and the presumption in
favor of main effects. For it would always be possible that the ap-
parent general confirmation of the law in two settings was in fact a
coincidence of two separate specific interaction effects.

C. Background Interactions

Background here refers to those common features shared by both ex-
perimental and control groups. Inevitably these involve many fea-
come the high expectancy and low expectancy treatment replications.
From Rosenberg come the evaluation apprehension control groups, and
the recommendation of experimental arrangements disguising the admin-
istrative relation between treatment and posttest. McGuire's and Rosen-
thal's chapters in this volume provide confirmation of the law-like char-
acter of threats to validity, in showing how such variables can shift
from being control problems to being focal.


III. A TYPOLOGY OF ARTIFACTS, BIASES, OR THREATS
TO VALID INFERENCE

While threats to validity or artifacts can come from any aspect of
the experimental process, and while a complete typology is not possible,
it may help to lay out some recurrent types of artifact.

A. Confounded Aspects of the Experimental Treatment: Main Effects 

c regard as aspects of the experimental treat- 

TnnT f Ki which differ between experimental and control groups 

are nvi ^ 0 J^ase are irrelevant to the theoretical vanable we 

are arV.ih-n'* ^ “>sltumental incidentals " Such features 

are unavni^M"' “ re'S*’* *'ave been other implementations, but 

have had to h^ had ftere not been these incidentals, there would 
Each one ireatments” are possible 

of an effect A avmt details is a potenbal rival explanabon 

°t IS exemn ifiedT'n ‘XPa In this volume, 

lat on ™s nl col f claim that relevant manipn- 

more 000^ 000^ r apprehension A still 

to the clfect that diirerenh!d‘e™OTll™? experiments, 

specdlc treatment details was^thre " 

Tuo ua\'s of ’ e essenhal treabnent vanable 

emerge First, there is Z"avay‘"of"thrne'i'"'’ 
panded content control group Thm t T 

modified to include more of what IS n ‘reatment is 

the cxpenmental group The sham o P^®"nnsly only experienced by 

gronps^reofthJort Thm!aS„ 7 ro?^r " 

apprehension, or demand character 'or®^“'’ '^‘ 1 "'™'^"! evaluation 

ated, increasing the common denommato^r"'" 
trol groups no longer dilfer on this feamro Se “ „d'T™' nor' 

tumstic scsirch for new modes of implementqho l i 

aanable is exempldied iiithnut partl^S "v':? vLtbl^Iat:! 
help show its utter unreasonableness. In the physical sciences, the pre-
sumption that there are no interactions with time (except those of daily,
lunar, seasonal, and other cycles) has proved to be a reasonable one.
But for the social sciences, a consideration of the potentially relevant
population characteristics shows that changes over time (e.g., a 30 year
comparison of college students) produce differences fully as large as
synchronous social class and sub-cultural differences. To representatively
sample from our intended universe of generalization would require
representative sampling in time, an obvious impossibility.
More typical of science is the case of Nicholson and Carlisle. Taking
in May, 1800, a very parochial and idiochronic sample of Soho water,
inserting into it a very biased sample of copper wire, into which flowed
a very local electrical current, they obtained hydrogen gas at one elec-
trode, oxygen at the other, and uninhibitedly generalized to all the water
in the world for all eternity. It was a hypothetical generalization, to
be sure, rather than a proven fact. There have been by now many
studies of the effect of ‘impurities’ in the water upon hydrolysis, but
these too have been done on very biased samples. The idea of a repre-
sentative sampling of all the waters of the world, or of England, never
occurred even as an ideal. The very concept of ‘impurities,’ of segregat-
ing the contents of water into the ‘pure’ stuff and the alien contents,
is one which would never have emerged had a representative sampling
approach to water been employed. In the successful sciences, generaliza-
tions have never been “inductive” in the sense of summarizing what
had been observed within the bounds of the generalization, but instead
have always been presumptive, albeit guided by prior laws. The limita-
tions to the generalization have emerged from checking in nonrepresenta-
tive ways on an initial bold generalization. Scientists assumed that hy-
drolysis held true universally until it was shown otherwise.
In this light, had we achieved one, there would be no need to apolo-
gize for a successful psychology of college sophomores, or even of North-
western University coeds, or of Wistar strain white rats. Exciting and
powerful laws would then be presumed to hold for all men or all verte-
brates at all times, until specific applications of that presumption proved
wrong. We already are at this latter stage, but even here a representative
sampling of species or school populations is not the answer. In-
stead, dimensional explorations, as in comparing primates which differ in
evolutionary development, are in the typical path of science.
It would be a fine achievement, even though not of
proven universality, to have a lawful psychology of volunteers (see
Rosenthal and Rosnow, Chapter 3). However, in this case a spe-
cific plausible hypothesis has been developed, predicting
tures unspecified in theory, probably even more numerous in the social
sciences than in the physical. All of these are potential sources of interac-
tion effects with the theoretically relevant aspect of the treatment or,
to be sure, with an irrelevant one of the aspects confounded with the
treatment. Again, these are so numerous that we can only pay attention
to the plausible and well developed rival hypotheses.

Hypotheses of interaction in this category, in the prior category, and
in several to come below (as on subject selection) are in some sense
not as serious threats as those in the first category. They represent only
potential limitations on the generality of a law already established in
one setting. It is only when that one setting is “artificial” and when
we are interested primarily in applying our generalizations to other set-
tings than that artificial laboratory, that such threats to validity worry
us. The possible pretest sensitization in persuasion studies, discussed
above, is of this nature. It should be recognized that an elegant science
of persuasion restricted to pretested audiences would be a quite worthy
scientific achievement, even if of little practical value, and that, by and
large, the physical sciences have been preoccupied with predicting ex-
^ v! “'‘hough to be sure, m their truly impres 

of quite 

“utunt coutrol group approach will 
rantol ^ experimental treatment providS^a general 

D. Interactions with Population Characteristics

cxpcrimLb- ( Campbd'r and ItML”' iS 

2= "IITSJ:'’*” 

pclcntnlhmitationscnftegenL|,(f„n"'^°',®™'‘P^- “ 

with a specific population We ^ f uhserved m a study done 
leading super eeo'^ideal from P^yuhology may inherit a mis 

besohed b> r:?;::ntati!^a”SnlT “’'“’‘’I 

relesanco, perhaps of all manlfnd^/o"' theoretical 

to achicscequn,aincc between quite 

groups should not be confused ,vift the TT'"'"'’ 

randomization to achieve representation emphasis upon 

[Campbell and Stanley, 19ra 23] ) specified population 

wiA Ohat'OOkmoO " - ^t of keeping 

with what we know of science that it should be removed even from
our philosophy of science. A consideration of the time dimension will
sures generates the jeopardy of discrepant results which are a great
embarrassment to write up.

F. Confounded Aspects of Measurement: Interactions with Treatments
Even when the measured change is due to the theoretically relevant
aspects of the measure, the irrelevant method components can condition
the reaction; the observed reaction thus may be specific to this particu-
lar mode of measurement. Again, control of such a plausible rival hy-
pothesis lies in alternative measurement devices. (Note that the interac-
tion of the relevant aspects of pretest measurement with the treatment
has been discussed above.)


IV. CONTROLLING ARTIFACTS 


In the previous section, several distinguishable modes of control have
been presented. These are: (1) expanded-content control groups, in the
tradition of sham operations and placebos; (2) treatment replication with
altered methods; and (3) multiple methods in measurement. The present
section continues this discussion with three additional points more gen-
eral in nature.


A. Controlling Plausible Rival Hypotheses
through Supplementary Variation

This heading refers to a very general technique of partial or inferential
control usable for many settings in which direct or complete control
is not possible. While its primary application has been in quasi-experi-
mental settings, it is available also in experimental ones. One noteworthy
implication is that clarity of inference sometimes may be improved by
deliberately reducing the quality of part of the data collected. Let us
begin with such an illustration.

In our study of cultural differences in susceptibility to optical illusions
(Segall et al., 1966) one of the plausible rival explanations of the differ-
ences obtained was in terms of variations in the administration of the
perceptual tasks by the various anthropologists involved. To control this,
we made what seems to me now to have been the amazingly brave
decision to deliberately debase half of our best data. The
instructions were for the test pages to be held vertically at four feet
from the respondents' eyes, not the easiest position to maintain when
a person is also to record results. In our Evanston sample, col-
lected on a door-to-door survey sample basis, half were done properly,
and the other half were administered as in table top presentation,
on a generalization we very much want to make as to nonvolunteering
populations, we attempt to control it. Not only would we want to gen-
eralize to such nonlaboratory populations for reasons of applied science,
we in experimental social psychology also aspire on pure science grounds
to bridging generalizations to the unavoidably nonexperimental social
sciences. Also to be noted in the volunteer subject problem is the fact
that the plausible interaction hypothesis affects not just one treatment
variable, but a very large class of them, e.g., to the effect that volunteer
subjects will show the results they believe the experimenter wants in
any experiment. Such a hypothesis is indeed threatening enough, so
that if empirically justified, it would make us want to shift populations
for our basic exploratory studies.


E. Confounded Aspects of Measurement: Main Effects 

Every measuring device, like every treatment, is dimensionally com-
plex, with many theoretically irrelevant vehicular components. The mea-
sured effects of the treatment could be due to one of these irrelevancies.
nn T' ufV" g™erated a vast hterature There 

attihiHp Studies of response sets m questionnaires, 

Carnnher?" '"S- Cronbaoh, 1946 1950, Rorer, 

another vaFt \ ^ Rees, 1967) Social desirability provides 

a^e halo F" ^-bngs, there 

artifacts in scores Ff d leones of personality There are 

pattern similantv , ^ interpersonal perception, and 

hate analogous problemsTweb™ri9^r'’ 

of the in’terpretTt?o* o*f hterature in criticism 

authontariamsm haL been enhe V differences m F-Scale 

1934) or acquiescent resnimso 1 intelligence (Chnsbe, 

the authontanan compoLnt bnt'S™™* h 

studies imohing the F-Scale t., ""’"ar cnhcisms of attitude change 
reason that this large segment o7tC^ geared It is perhaps for this 
represented m this volume " research artifact literature is not 

Control for these problems tTimi.rrU 
differing m vehicular OT method commi®*' "’“'‘■pie measures 

Campbell, 1960) In most Hboratoiy^penm 
be done than usually is, and w.th^i„rch ^ a 
Uian would be imoWei m 

Probably more is done than is reported, because having multiple mea-
of fish, which we know well how to do. If, despite the widest possible
variation in hunger, progressive improvement fails to appear in the fish,
we may reject the hunger hypothesis. Hypotheses about other variables
also may be tested by systematic variation. With regard to the question
of reversal learning, I shall simply say here that progressive improvement
has appeared in the rat under a wide variety of experimental condi-
tions; it is difficult, in fact, to find a set of conditions under which
the rat does not show improvement. In the fish, by contrast, reliable
evidence of improvement has failed to appear under a variety of condi-
tions” (Bitterman, 1965, 396-410).

As applied for the control of artifacts, two types can be distinguished.
On the one hand there is interpolating or bracketing variation, in which
the supplementary variation includes the whole likely range or more.
The two illustrations from the optical illusions study are of that nature.
As controls, these assume linearity or monotonicity of laws, i.e., that
intermediate values would have intermediate effect. This is usually a
reasonable enough assumption to render the threat implausible if the
extreme bracketing values find it so.
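The reasoning behind bracketing variation can be sketched in a few lines. This is a hedged illustration only: the function names and all of the numbers are hypothetical, and the sketch simply encodes the monotonicity assumption stated above, namely that an intermediate value of the nuisance variable cannot shift the outcome by more than the two bracketing extremes do.

```python
def max_rival_shift(outcome_at_low_extreme, outcome_at_high_extreme):
    """Largest outcome shift the nuisance variable could produce.

    Valid only under the monotonicity assumption: outcomes at intermediate
    values of the nuisance variable lie between the outcomes at the extremes.
    """
    return abs(outcome_at_high_extreme - outcome_at_low_extreme)


def rival_is_implausible(outcome_at_low_extreme, outcome_at_high_extreme,
                         focal_difference):
    """True if the bracketed nuisance variable cannot account for the focal difference."""
    shift = max_rival_shift(outcome_at_low_extreme, outcome_at_high_extreme)
    return shift < abs(focal_difference)


if __name__ == "__main__":
    # Hypothetical numbers: mean score under the most careful and the most
    # slovenly administration, and the group difference to be explained.
    careful, slovenly = 5.0, 5.5
    group_difference = 3.0
    print(rival_is_implausible(careful, slovenly, group_difference))
```

If the assumption of monotonicity fails, so that some intermediate value of the nuisance variable has a larger effect than either extreme, the check above gives no protection; that is the price of treating this as partial or inferential control.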

Second, there is extrapolating variation, in which we do not have
full access to all values of the dimension, and to achieve our control
must extrapolate outside the range of explored values to unobtainable
values. The problem of volunteer respondents might be such. What
we would like to do is to extrapolate to the nonvolunteering population,
but in even the best we can do, some degree of volunteering is required.
A degree of control is introduced if one adds a much more extremely
voluntary situation, more voluntary than would normally be used. If
these two degrees of voluntarism show the same laws, we extrapolate,
assuming monotonicity, to the condition of no voluntarism at all. Here
the assumptions involved seem intuitively less plausible than in the
bracketing case, but are still plausible enough to make such a control
worth adding. In this case too we have added a body of deliberately
poorer data.

[...] utilization of supplementary variation as a [...] common practice
of classifying respondents on the basis of post-session interviews as to
degree of awareness of the experimental p[...] and checking the
replication of the same laws in [...] research in this volume has
extended this by deliberately [...]ucing more extreme degrees of
awareness on an experime[...]

Heteromethod Replication

The [...] iterated history in research on artifacts is for an exciting
one treatment, one measure experiment to be criticized with specific

one and one half feet from the eyes with the booklet in a horizontal
position. The latter was thought to be more slovenly than any actual
administration, but in the likely direction of deviation. These two
conditions did produce differences, but small ones, not at all sufficient
to explain the cultural differences, which were five to seven times larger.
Our resultant power of inferences was greater than had all of the
Evanston data been of the best quality.

This study also provides a second illustration. After the major body
of data had been collected, published research appeared indicating that
inspection time differences were a possible plausible rival explanation.
These were “controlled” by collecting a new Evanston sample in which
two exposure times were used, one very brief, the other much larger
than was likely to have occurred in any sample. Here again, while there
[...]

In [...] control, Naroll (1962) divides ethnographies [...] cultural
comparisons into two or more levels of quality. [...] ethnographer lived
in the area for [...] or more [...] the ethnography would be classed as
of h[...] quality. Variables such as belief in witchcraft and [...]
positions in childbirth turn out to be correlated with data quality [...]
noted by those with better acquaintance [...] show a spurious correlation
with [...] data quality. Naroll [...] introducing the variation in data
quality [...] the existence of such spurious [...] laboratory of
comparative psychology, has [...] precept of which these quality
variations are [...]

“I do not, of course, know how to arrange a set of conditions for
the fish which will make [...] demands exactly equal to those which are
made upon the rat in [...] experimental situation. Nor do I know how to
equate [...] reward value in the two animals. Fortunately, however, [...]
comparisons still are possible, because for control by equation we may
substitute what I call control by systematic variation. [...] example,
the hypothesis that the difference between the [...] is due to a
difference, not in learning, but in [...] hunger. The hypothesis implies
that there is a level of hunger at which the fish will show progressive
improvement, and, put in this way, the hypothesis becomes easy to test.
We have only to vary level of hunger widely in different groups

lenge as it appears as a specific local anomaly in an otherwise
straightforward scientific quest.

If we are indeed in an extremely difficult arena, then there is even
a motivational utility in the regular occurrence of exciting findings which
later are discounted as artifacts. These provide exciting rewards to the
would-be discoverers, and exciting rewards to the successful critics (the
more exciting the greater the reputation of the false claims). These
are rewards and motivation for experimental work and empirical
exploration. Both would be lost under a procedure that effectively screened
out overoptimistic pseudo-confirmations of exciting theories.


C. Disguised Experiments in Natural Settings

While the formalism of the previous section provided a useful general
perspective on possible artifacts, it serves to fragment a central class
of plausible rival hypotheses with which this volume deals. This we
can call awareness of experimentation, or as I once labeled it in 1957,
reactive arrangements:


“In any of the experimental designs, the respondents can become
aware that they are participating in an experiment, and this awareness
can have an interactive effect, in creating reactions to X [experimental
treatment] which would not occur had X been encountered without this
“I'm a guinea pig” attitude. Lazarsfeld (1948), Kerr (1945), and Rosenthal
and Frank (1956), all have provided valuable discussions of this
problem. Such effects limit generalizations to respondents having this
awareness, and preclude generalization to the population encountering
X with nonexperimental attitudes. The direction of the effect may be
one of negativism, such as an unwillingness to admit to any persuasion
or change. This would be comparable to the absence of any
effect from discredited communicators, as found by Hovland [...].
The result is probably more often a cooperative responsiveness, in which
the respondent accepts the experimenter's expectations and provides
pseudoconfirmation. Particularly is this positive response likely when
the respondents are self-selected seekers after the cure [...]. The
Hawthorne studies (Roethlisberger and Dickson, 1939) [...] such
sympathetic changes due to awareness of the experimentation rather than
to the specific nature of X.

The problem of reactive arrangements is distributed over [...] of the
experiment which can draw the attention of the respondent to [...]
experimentation and its purposes. The conspicuous [...] pretest is
particularly vulnerable, inasmuch as it signals [...] purposes of the
experimenter. For communications of obviously persua-

plausible rival hypotheses, and to be followed up by a series of
experiments with expanded control groups or changed treatment method, or
changed measurement method, until the original finding is doubly
confirmed or rejected in favor of some rival interpretation. Even where
the field is active and research is cheap, this cycle takes a good ten
years or more, as illustrated by Rosenberg's work on the dissonance
experiments, reported in this volume. Any strategy which would cut
down on this wasteful procedure would seem at first glance to be worth
introducing.


If one reviews the control comments in the previous section on types
of artifact, one notes a very general utility to varied experimental
implementation. Multiple methods of measurement have a parallel value.
There emerges the suggestion of routinely programming heteromethod
replication in the initial research phase. Each Ph.D. research would, for
example, be required to induce the treatment variable in two
methodologically independent ways, and for each implementation to measure
the effects by two independent methods. If an hypothesized law was
initially confirmed in all of the four heteromethod replications thus
generated, most of the probably plausible rival hypotheses would have been
ruled out in advance (without having ever been explicitly formulated). If
all four were not consistent, but if there were several strong effects, the
candidate would be left with a challenging empirical puzzle upon which to
work, without the temptation to overstrong theoretical claims which he
would have had if he had only seen one part of the puzzle.
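
The four-cell design just described can be laid out concretely. The
following is a minimal sketch under stated assumptions: the method labels,
the effect estimates, and the confirmation threshold are all hypothetical,
and the three-way classification is only one simple way of reading the
pattern of results.

```python
from itertools import product


def heteromethod_replications(treatments, measures):
    """List every treatment-method x measurement-method pairing."""
    return list(product(treatments, measures))


def summarize(effects, threshold):
    """Classify the pattern of results across replications.

    effects: dict mapping (treatment, measure) -> estimated effect size.
    threshold: smallest effect counted as a confirmation (an assumption).
    """
    confirmed = [cell for cell, e in effects.items() if e >= threshold]
    if len(confirmed) == len(effects):
        return "confirmed in all replications"
    if confirmed:
        return "mixed: an empirical puzzle to work on"
    return "not confirmed"


cells = heteromethod_replications(
    ["film", "lecture"], ["questionnaire", "behavior"]
)
# Hypothetical effect estimates, one per cell of the 2 x 2 design.
effects = dict(zip(cells, [0.6, 0.5, 0.7, 0.1]))
print(len(cells))               # 4
print(summarize(effects, 0.3))  # mixed: an empirical puzzle to work on
```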
This methodological precept is, however, not recommended. If I judge
our theoretical successes and experimental skills properly, full [...]
never be found. The process would in general [...] present practice, so
much more so that many [...] editors would al[...] presentation, under
current standards against publishing [...] novel hypotheses, and these
standards are probably [...] information processing [...] formal system
of [...] is much higher when each investigator believes [...] proven a
true theory. Some [...] necessary, both in anticipation and in retrospect
upon accomplished research. So too with a perspective [...] to anticipate
[...] aspect of all research, but instead to close our eyes to their
general possibility, and to regard each such chal-





with.” Such experiments are best done in natural rather than laboratory
settings, not because natural settings are more representative of the
target of generalization, but rather because in natural settings
respondents do not suspect they are being experimented with. Laboratories,
in general, are perceived as just that, i.e., as settings for experiments.
The force of the argument may be strengthened if we note that most
of the laboratory studies with dramatic “experimental realism” achieve
this by distracting the respondent with some plausible façade or cover
story while introducing the treatment as an incidental or accidental
event. Thus French (1944) assembled groups for discussion purposes
and then used smoke seeping under the door as an experimental treatment.
Orne (Chapter 5) uses an “accidental” power failure, Darley
and Latané (1968) an epileptic seizure. For some, the real experiment
is among the respondents waiting to serve in the experiment. For the
innumerable experiments using a confederate, the treatment is the
performance of a fellow respondent. The incidental fact that one
experimenter was Negro, the other Caucasian, has been used (Rankin and
Campbell, 1955). The respondent has frequently been led to believe
that he is the experimenter (e.g., Festinger and Carlsmith, 1959; Milgram,
1963), and so forth. All of these are efforts to use the natural
aspects of the setting, to evade the effects of awareness of
experimentation. The utility of these deceptions is being lost through
publicity, but can be regained for a while by moving out of the laboratory
entirely.
Social psychology has had by now enough experience with
experiments in natural settings to provide the basis of a mature [...]
methodology. Webb and his associates (1966) have provided [...]
beginnings, although their focus is on measurement rather than [...]
treatments. Aronson and Carlsmith (1968) provide [...] of the framework.
The projected book by Gross, [...] (in preparation) may fill the bill, as
also may Rosenblatt and [...] (in preparation), but the task has not yet
been done, [...] or do it. However, a few paragraphs and illustrations
[...] to the illustrations, two general issues will be [...] perspectives
for evaluation.

What are needed to [...] social [...] experiment are a natural mode of
contact to [...] units small enough and numerous enough so that random
assi[gnment] achieves effective equation, [...] enough so that there is
no [...] that other units [...] different treatments, and with a natural
[...] setting relevant as a measure of effect. [...] cannot be created at
will for [...] and with all possible measures. Instead, they must be
opportunisti-

sive aim, the experimenter's topical intent is signaled by the X itself, if
the communication does not seem a part of the natural environment.
Even for the posttest-only groups, the occurrence of the posttest may
create a reactive effect. The respondent may say to himself, “Aha, now I
see why we got that movie.” This consideration justifies the practice of
disguising the connection between O [observation or measurement] and
X, as through having different experimental personnel involved,
using different façades, separating the settings and times, and embedding
the X-relevant content of O among a disguising variety of other topics”
(Campbell, 1957, 308-309).


Many, although not all, of the artifacts covered in the previous chapters
are subsumable under the hypothesis that the results are what they
are only because the subjects were aware that they were being
experimented with, including the possibility of differential awareness on
the part of the experimental group. Orne's demand characteristics (Chapter
5) are entirely of this nature, although the placebo effects which he
also reviews are not, inasmuch as they no doubt are also characteristic
of nonexperimental medical applications of drugs. Pretest sensitization
(Chapter 4), had it been empirically established, would have been in
this class, and possibly the commitment effect of pretest is also due

[...] be experimentally [...] Volunteering for an experiment (Chapter 3)
implies aware[ness ...] Experimenter effects (Chapter [...]) [...]
Pygmalion effects occur [...] experiment) but those aspects of the [...]
respondent cooperation with the perceived goal [...]. Suspiciousness of
experimenter's intent (Chapter 2) [...] and McGuire's [...] illustrations
Rosenberg provides. The interaction effects of [...] efforts at indirect
attitude measurement [...] and unobtrusive measurement [...] (Aronson
and Carlsmith, 1968) is to a large [...] much of the research reported
[...] volume is reassuring, much of it is not. [...] as a substitute
because it carries awareness of e[...]. The obvious cure for all these is
[...] experiment in which the respondents (if not the experimenters) are
unaware of participating in an experiment, are unaware that they are
“being experimented

A separate problem is that of obtaining the respondent's permission,
a problem which has become of great practical importance now that
half of our research support requires it. This is obviously an
impossibility in the experimental setups to be described here, if disguise
and unawareness is to be maintained. On the other hand, in those settings
using means and ranges of communication that are within the public domain,
and which nonexperimenters are using freely without such permission,
this becomes an utterly unreasonable requirement.
Another ethical problem is that of invasion of privacy. This is not
a necessary aspect of disguised naturalistic experiments, and indeed
is an impossibility in some. Anonymity of records is an aspect of the
problem. However, when potentially embarrassing material is collected
in a manner that makes possible linking it with the person's name, the
threat of invasion of privacy is made worse by the disguise and
the lack of permission. Injury, including humiliation and insult, is a
problem no greater in degree than in laboratory experiments.
Debriefing, explaining to the respondent the true nature of the
experiment, apologizing for the deception, and if possible, providing
feedback of the results, are procedures characteristic of the
self-announced campus laboratory and generally omitted from the disguised
field experiment. While such debriefing has come to be a standard part of
deception experiments in the laboratory, it has many ethical
disadvantages. It is many times more of a comfort to the experimenter for
his pain at deceiving than to the respondent, who may learn in the process
of his own gullibility, conformity, cruelty, or bias. It provides modeling
and publicity for deceit and thus serves to debase language for the
respondent as well as for the experimenter. It reduces the credibility of
the laboratory and undermines the utility of deceit in future experiments.
Argyle (1962), Milton Rokeach (personal communications), Stollak (1967),
McGuire (Chapter 2), and Aronson and Carlsmith (1968) have called
attention to these disadvantages, and they are strong enough to justify
elimination of debriefing in those cases where the experimental treatment
falls within the range of the respondent's ordinary experience, merely
being an experimental rearrangement of normal-level communications.
This normal range is certainly exceeded in the Asch (1956) studies [...]
present eight fellow Swarthmore students in solid contradiction of what
would ordinarily have been a simple perceptual judgment, or the Milgram
(1963) studies in which the respondent had to administer strong electric
shocks to a fellow student. It is probably exceeded in persuasive


• The gleeful reporting of deception experiments in introductory [...]
and lectures is probably still more important in regard to these [...]

cally hit upon. Any given setting will inevitably impose great restrictions
on the kinds of problems that can be studied in it. These restrictions
will be upon the kinds of experimental variables that can be implemented
and upon the modes of measurement available.
2. Deceit, debriefing, and other ethical issues. Disguised experiments
obviously involve deceit at some level, and as McGuire (Chapter 2,
part 5) and Kelman (1967) make clear, this is an undesirable feature
only justified by more important considerations. One of these
considerations is the moral value of producing a nontrivial social science.
In any such comparative weighing of competing values, the degree of each
becomes relevant, for example, the magnitude of the deceit. In terms of
pain to the liar (the experimenter), white lies are less painful than
black ones (McGuire's “active deceit”), and while they may be equally
damaging to the recipient, he would likewise judge them less immoral
due to our linguistic legalism. Lying of either sort is less painful and
less immoral when occurring in a setting where it is both expected and
justified by convention. In terms of debasing language and our communal
ability to depend upon the verbal reports of others (Asch, 1952; Campbell,
1965), the effect is greater the more that lying is conspicuously
exhibited by high-prestige models. For all of these, the adaptation level
created by other segments of social practice provides a relativistic
comparison base. A flagrant lie is less immoral introduced into a language
community where such lies are frequent than when it is a novelty.
In these terms, disguised naturalistic experiments vary greatly, and
probably in balance present no greater problems than do laboratory
ones. They probably depend more on nonverbal or white lies, less on
direct deceit. They operate typically in arenas of discourse already more
debased by deceit than are the halls of learning. If lying is revealed,
the modeling impact is presumably less than it is in the professor-student
relationship. But natural settings generally lack the implicit convention
of acceptable lying which the psychology laboratory may be achieving.


[...] practical way of avoiding the ethical problem on campus would
be to anno[unce ...] beginning of the term [...] In about [...]
participating this semester [...] experiment for the experimenter to
deceive you [...] as to [...] be able to inform [...] after all the data
[...] guarantee that no possible danger or invasion of privacy will be
involved, and [...] under these conditions [...] generally understood and
probably would not worsen the problem of awareness and suspicion that now
exists.

licity to minor issues, just because of the relative absence of other
communications on the same topic.) Using this laboratory for conformity
studies (Campbell, 1951), one would abstain from feedback of falsified
public opinion poll results, so readily used in the college laboratory,
and limit one's comparisons to the presence or absence of feedback,
and the source (precinct, state, or nation) from which the feedback
came. This limitation would in some cases represent a very real sacrifice
in clarity of experimental inference, but to present falsified poll results
would be an intolerable tampering with the ballot, quite different in
kind than had not the action of voting been involved.

Most disguised field experiments provide more limited laboratories
than this, and are opportunistically hit upon for very specific purposes.
Thus in a conformity study Lefkowitz, Blake, and Mouton (1955)
modeled walking across an intersection against the light, in high-status
or low-status clothing, and observed the effect upon an observer's
tendency to violate the traffic light. Schwartz and Skolnick (1962)
manipulated the contents of applicant briefs sent to employers of temporary
summer resort help, studying the effect of a criminal record upon
employability. Schwartz and Orleans (1967) used income tax returns to
measure aroused fear of legal sanctions. Bryan and Test (1966) created
the altruism opportunity of helping a woman with a flat tire, with and
without a prior helping model. Page (1958) randomly applied motivating
comments on student papers and measured impressive effects on later
classroom tests. Doob and Gross (1968) used the horn-honking response
of the car behind and the experimental treatments of failing to go when
the light went green, in high-status versus low-status cars.

While some of these are such restricted laboratories that one can
hardly imagine any other problem being studied in them, some can
be more broadly used. Thus the Schwartz and Skolnick setting could
be used for a wide variety of topics in impression formation or the
presentation of self, albeit with a very impoverished dimensionality of
effect measures. Even a technique seemingly so narrowly focused on
honesty as the lost letter technique (Merritt and Fowler, 1948) lends
itself to the addition of many other variables. By addressing envelopes
to attitude-relevant groups, Milgram (Milgram, Mann, and Harter,
1965) has obtained performance measures of [...] validity. By leaving
the envelopes unsealed, Gross has been able to use variations in letter
content to manipulate a variety of variables.

4. Employment. Essential to experimentation is arbitrary [...]
some segment of a person's time. It is this arbitrariness [...]
causal links between past conditions and experimental [...]

communications containing fictitious facts on important topics, but is
probably not exceeded in most persuasion studies. In experimental social
psychology, we are doomed to wear out our laboratories. For this reason
we are already leaving the college in favor of the high school, the
grammar school, and the street. Publicity will eventually contaminate these
laboratories too, but this process will be greatly increased, and public
anger over the deception not reduced, by debriefing in disguised
naturalistic experiments.

3. A range of classic studies. Gosnell (1927) sent persuasive messages
to registered voters urging them to vote, and used precinct
records to later determine whether or not the members of different
experimental and control groups had voted, achieving an entirely
inconspicuous experiment using a range of communications well within normal
limits. While today there would be distrust of Chicago's precinct records,
other cities' are still useable. Here is a laboratory which should have
been reused a hundred times by now, but so far as I know, it has
not been reused even once. While the topic is very narrow, the
persuasive messages could vary along a wide range of the experimental
dimensions utilized in laboratory persuasion studies. The value of this
laboratory would greatly increase if one could use how the person voted
as well as whether he voted. While this information is not public for
individuals, it is public for precincts as a whole, and this became the
sampling unit for Hartmann's (1936) classic study of rational versus
emotional political leaflets. Used in a state like California, in which
voters get to vote on issues as well as persons, a very wide range of
persuasion theories could be tested. Again, Hartmann's laboratory has not
been reused. In these studies, permission and debriefing would seem totally
unwarranted, unless the content of the communications contained libel
or falsehood, and if so, debriefing after the election would certainly
raise a storm of justified protest. Thus the range of experimental stimuli
is certainly limited, but could still cover one-sided versus two-sided
communications, extremity of position advocated, or degree of adulatory
vocabulary. Gosnell and Hartmann were both advocating sides they
genuinely believed in. (Hartmann was himself running for mayor on
the Socialist ticket.) This sincerity, and the related nondeception, would
be lost in primacy-recency studies in which both of opposing alternatives
are advocated, unless experimenters of opposing advocacies collaborated,
or unless an experimenter got endorsements for the appropriate messages
from the opposing sides. This is moving into the white lie area, but
on the other hand all that need be manipulated is the when and to whom
of messages that are going to get partial and haphazard distribution
anyway. (In such studies one would often give disproportionate pub-

of varying size and noted the number of passers-by who were thereby
attracted. Sommer's (1959) approach to interpersonal space lends itself
to such settings, through asking strangers questions, sitting next to them
on public conveyances, at cafeteria tables, etc. The possibilities are wide
in range, including the ethically unacceptable. Rumors already provide
reports of epileptic seizures enacted on streets, of experimental taxi
drivers introducing unexplained delays to frustrate anxious passengers,
and the like.


6. Sample solicitation. For persuasion studies, a broadly flexible
disguised laboratory is provided by all those settings in which custom
sanctions making appeals to strangers. Intrinsically, a response measure
is made available in the natural response to the appeal. Selling, fund
raising, and petition circulating exemplify the sanctioned goals; direct
mail, telephone, and door-to-door contacts, the sanctioned means. Survey
research establishments provide a readily transferable sampling
technology and staff (and what more poetic reversal than to have public
opinion surveyors pose as salesmen). Door-to-door or letter-to-letter
variations in the persuasive appeal provide an elegant opportunity for
random equivalence without respondents being aware of experimentation.
(Spatial separation of comparison groups receiving different appeals would
often be desirable to avoid suspicion through respondents comparing
experiences.) It is a commentary on the ethics of white lies that the
experimenter and the doorbell ringers would feel better about the
deception if a genuine interest in the fund collection or the product
promotion could be incorporated, as it often could be by offering one's
services to the relevant causes. Note here an additional financial
advantage, in that costs of collection are legitimately deductible from the
proceeds in much charitable fund raising. Salesmen's commissions would
have a similar role. These “sincere” façades would also expedite getting
the solicitation permits that most police departments now require. One
can envisage primacy-recency studies in which funds were solicited
alternately for the White Citizen's League and the Black Power Coalition.
One could study the effect of degrees of fear-arousing appeals for nuclear
disarmament on the sales of air-raid-shelter construction plans.
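
Random assignment combined with spatial separation amounts to randomizing
whole blocks of households rather than individual doors. The sketch below
illustrates that idea only; the household and block identifiers, the
appeal labels, and the rotation used to balance the groups are
hypothetical choices made here, not details taken from any study cited.

```python
import random


def assign_appeals_by_block(households, appeals, seed=0):
    """Randomly assign one appeal to each block (cluster) of households.

    households: list of (household_id, block_id) pairs.
    appeals: list of appeal labels.
    Every household in a block receives the same appeal, so neighbors
    cannot discover the variation by comparing experiences.
    """
    rng = random.Random(seed)
    blocks = sorted({block for _, block in households})
    rng.shuffle(blocks)
    # Deal the shuffled blocks out to the appeals in rotation for balance.
    block_to_appeal = {
        b: appeals[i % len(appeals)] for i, b in enumerate(blocks)
    }
    return {hid: block_to_appeal[block] for hid, block in households}


# Forty hypothetical households in eight blocks of five.
households = [(f"h{i}", f"block{i // 5}") for i in range(40)]
assignment = assign_appeals_by_block(households, ["mild", "strong"])
print(assignment["h0"] == assignment["h4"])  # True: same block, same appeal
```

Because the block, not the household, is the unit assigned, the block is
also the unit that should be counted when results are analyzed.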
For fund raising, the amount given provides a relevant quantification,
and the comments of even the noncontributors can be graded for
[favor]ableness. For sales, the dichotomous sale-no sale can be enriched by
a series of Guttman scale steps, through offering postca[rds ...] to
be used for postponed purchase decisions, booklets with additional
i[...], etc. For petitions, the natural measure is dichotomous, but not
unuseable on that account, and comments are codeable [...]
surveys (although face-to-face recording of comments would be

which makes possible randomly assigning equivalent samples to different
treatments. The greater this arbitrary control, the greater the
multipurpose experimental utility. One such setting is provided by the
employment situation. I will neglect here its use in applied experiments
focused on the employer's problems and administrative options (e.g.,
Feldman, 1937; Kerr, 1945), and focus instead on the uses of employment for
research in theoretical social psychology. Adams (1963) has set an
outstanding example in his studies of pay inequity and work produced.
Typical is his use of short-term part-time employees, in which the wages
paid represent a research cost of the same order of magnitude as paying
subjects in the manifest laboratory. Stuart Cook (1964) has used this
setting in his classic study (as yet unpublished) of the effect of
equal-status contact on race attitudes.


The admirable study of Rokeach and Mezei (1966) used a related
setting, the employment agency, in replicating a finding already
demonstrated in more artificial laboratories. (Close inspection of their
results, however, suggests some degree of leaning-over-backwards in the
direction of fair play in interracial contacts, a trend possibly
symptomatic of reactive arrangements.)

In these illustrative studies, no extreme or damaging treatments were
[...] of experiences that some [...] would have been, or might well have
been, exposed to anyway. For [...] unnecessary, if not unwise, and [...]
But this is of course a matter [...] contrast the use of military [...]
imminent death (Berkun, Bialek, Kern, and Yagi, 1962; Daily Palo Alto
Times, 1959; Argyle, 1960) [...] a landmark in unethical excess of
scientific zeal.

[...] considerable range of [...] encounters of strangers. Bryan and
Test [...] in which a somewhat unusually [...] the women questions
about their reactions to the different kinds of [...] Bickman and
Berkowitz (submitted) [...]

not exceed the frustratingness of the regular range of female customers,
some 30 per cent of whom do not make purchases in any given visit.
Given sufficient budget and the use of a large number of experimental
customers, a desirable feature in any event (Hammond, 1954; Brunswik,
1956), a final purchase could be included without social waste. The
setting is one in which the social norms for deceit have already been
debased not only through deceit in salesmanship, but also through the
use of pseudo-customers to check on employee courtesy, effectiveness,
and honesty, entrapments which also invade privacy by attaching the
acts to the salesman's name. In contrast, the research customer provides
complete privacy and anonymity. Debriefing would probably not reduce
the salesman's frustration, but merely change its target. For a
professional proud of his sophistication and cynicism, it would be painful
to learn he'd been had. Some damage to the future utility of natural
settings would result even from the salesman's private communications,
and the possibility of journalistic publicity would be greatly enhanced.
The nature of the experimental treatment is the crucial factor, and those
treatments requiring debriefing should probably not be used anyway
without the respondent's permission. On the other hand there are the
ethical values of a relevant and dependable social science, and our
desperate shortage of appropriate laboratories.

7. Artifacts. A spirit of advocacy has slipped into the presentation
of the previous paragraphs, but this must not be allowed to blind us
to the fact that disguised field experiments share the epistemological
predicament described so pessimistically in the earlier sections of this
paper. It is only the one family of artifacts related to awareness of
participating in an experiment that is controlled. Of the artifacts treated
in this volume, it is obvious that experimenter effects will be likely in
most of the natural settings here described, aggravated in some by the
experimenter having to record verbal reactions after having left the
respondent. For each of them, the treatment variable and the response
measure will turn out to be conceptually complex, with irrelevant aspects
frequently responsible for the results, either as main effects or as
modifying interactions. For the natural response measures involved, Webb
and associates (1966), in spite of their generally optimistic tone, have
provided detailed grounds for pessimism. In the end, expanded-content
control groups or replication with varied treatments and measures will
be required, just as they have been for laboratory studies.


V. SUMMARY 

The logic of scientific inference indicates that we cannot
prove theories, but only probe them. For every theory-corroborating
In some settings, a mild and a strong version of the petition could
be offered without reducing plausibility.

Blake and his associates (Blake, Mouton, and Hain, 1956; Helson,
Blake, and Mouton, 1958) have pioneered the use of petitions in
nonlaboratory campus studies, as have also Gore and Rotter (1963).
In the advertising industry there are some highly applied experiments
with direct mail advertising. Cook and Insko (1968) have used mailed
letters of differing contents as experimental treatments. Brock (1965)
has used a salesman in a store to administer varying experimental
treatments. It is probable that door-to-door sales companies have done some
deliberate experimentation in techniques. But by and large, this vast
range of possibilities has not been utilized for the purposes of science.
While opinion surveys would seem apt to invoke a “guinea pig” effect,
they have become enough a part of the public scene so that several
theoretically oriented experimenters have used them to present varied
treatments in disguised field experiments (e.g., Abelson and Miller, 1967;
Freedman and Fraser, 1966; Miller and Levy, 1967). They are less
disguised than sample solicitations in general. Artifices have to be added
to introduce persuasive content or other treatment, while solicitation
is an occasion for persuasion. On the other hand, they offer the
special advantage of justifying verbal attitude measures.
niSFf I' salesmen may be expenmenters The old civil 

“ e^penmental paradigm 

ternfhtZrr to'^hous.^ 

Wilkim and y 1 ^ in restaurants (Kutner, 

customeVrbellv <1950) experimentally varied 

XTce Tunv hTqV" ' ® willingness of druggists to give medical 

customeL ot va b/ presenting automobile dealers with 

customers of varying degrees of gullibility. Feldman (1968) varied the


to aebnei tne salesman and remunerate tiim v. x a 

these procedures are the following considerations ‘^li'e trSntTes 
experimental result there are an infinity of rival explanations potentially
available, a few of which we must attend to because they are both
explicitly advocated and have a plausibility comparable to that of the
theory corroborated. A major class of these plausible rival hypotheses
are methodological artifacts introduced through irrelevant vehicular
aspects of the experimental treatment or the measuring device, either as
main effects or interactions.

Control can never be complete in ruling out all plausible rival
hypotheses in advance. As a rule, research must seek out ways of controlling
each artifact as it is developing, through means that are specific to
each combination of artifact hypothesis and theoretical variable. But
general-purpose controls are discovered for recurrent classes of artifacts,
and these become the empirically developed methodological requirements
of a field. General strategies of control include expanding the content
of the control group, varying the vehicular irrelevancies of the treatment
variable, varying the method of measurement, and supplementary
variation in data quality.

Because most of the important artifact hypotheses in laboratory social
psychology are made possible by the respondent's awareness that he
is participating in an experiment, attention is given to the techniques
and ethics of disguised experiments in natural, nonlaboratory settings.
Such experiments do not avoid the general artifact problem, but just
this one type.